[57] ABSTRACT

The invention pertains to a novel transcriptional activity factor, Tat-Stimulatory Factor, as well as genes encoding this factor and fragments and biologically functional variants thereof. The Tat-Stimulatory Factor is involved in the regulation of transcriptional elongation of HIV-1 by Tat. The invention also pertains to therapeutics involving the foregoing proteins and genes, and agents that bind to the foregoing proteins and genes. The invention also relates to methods of screening for a compound which binds to Tat-SF1, Tat-SF1-associated kinase and/or a complex of Tat-SF1 and Tat-SF1-associated kinase, as well as methods of screening for compounds which modulate Tat-SF1-mediated transcriptional activation.

13 Claims, 9 Drawing Sheets 
TAT-SF: COFACTOR FOR STIMULATION OF TRANSCRIPTIONAL ELONGATION BY HIV-1 TAT

This application claims the benefit of provisional application 60/021,218, filed Jul. 3, 1996, and provisional application 60/033,152, filed Dec. 13, 1996.

GOVERNMENT SUPPORT

This work was funded in part by the National Institutes of Health under the Grant Nos. GM34277 and AI32486, and the National Cancer Institute under Center core Grant No. CA14051. The Government may retain certain rights in this invention.

BACKGROUND OF THE INVENTION

Intricate mechanisms regulate mRNA synthesis by control of initiation or elongation of transcription. An understanding of these mechanisms and the factors controlling these mechanisms would be important in designing therapeutic modalities for treating a variety of important medical conditions, including cancer and infection. For example, HIV-1 transcriptional elongation by Tat is essential for viral replication. Interruption of transcriptional elongation by Tat, therefore, would be highly desirable as a means for treating HIV-infected individuals.

Tat activation of HIV-1 transcription is mechanistically different from conventional activation of transcription by DNA sequence-specific transcription factors. First, most conventional activators affect transcription primarily through increasing the rate of initiation, although recent studies indicate that some prototype DNA sequence-specific transcription factors such as GAL4-VP16 can stimulate both initiation and elongation. In contrast, Tat predominantly stimulates the efficiency of elongation. Secondly, while most conventional activators interact with promoter or enhancer DNA, Tat interacts with the trans-acting responsive (TAR) RNA element. TAR is located at the 5' end of the nascent viral transcript and forms a stem-loop structure. The specific binding of Tat to TAR depends primarily upon the integrity of the bulge loop and immediately flanking sequences in the double-stranded RNA. Sequences in the apical loop of TAR are also important for Tat activation of transcription in vivo.

Control of transcriptional elongation thus has been recognized as an important step in gene regulation, but mechanisms regulating the efficiency of elongation, mediated by RNA polymerase II, have not been extensively studied. The necessity for strict control of elongation for proper gene regulation is further highlighted by the recent finding that an elongation factor, Elongin, is probably the functional target of the von Hippel-Lindau tumor suppressor protein.

SUMMARY OF THE INVENTION

The invention involves in one respect the identification, purification, and isolation of proteins, Tat-Stimulatory Factor protein, which are specifically required for Tat trans-activation. The invention also involves nucleic acid molecules encoding those proteins. The invention further involves the discovery and identification of kinases that bind the Tat-Stimulatory Factor proteins, which binding is believed important for TAT transcriptional elongation. The expression and biological activity of the proteins are necessary for transcriptional elongation, and alteration of the expression or biological activity of these proteins can be used to influence transcriptional activity, and thereby affect critical cellular processes.

The preferred nucleic acids of the invention are homologues and alleles of the coding region of the nucleic acid of SEQ ID NO:1. The invention further embraces functional equivalents, variants, analogues and fragments of the foregoing nucleic acids and also embraces proteins and peptides coded for by any of the foregoing.

According to one particular aspect of the invention, an isolated nucleic acid molecule is provided. The molecule hybridizes under stringent conditions to a molecule consisting of the coding region of the nucleic acid sequence of SEQ ID NO:1 and it codes for a Tat-Stimulatory Factor protein. The invention further embraces nucleic acid molecules that differ from the foregoing isolated nucleic acid molecules in codon sequence due to the degeneracy of the genetic code. The invention also embraces complements of the foregoing nucleic acids. Preferred isolated nucleic acid molecules are [illegible] human cDNA [illegible] corresponding to SEQ ID NO:1. [illegible] molecules are [illegible] contemplated by the inventors.

The invention in another aspect involves expression vectors, and host cells transformed or transfected with such expression vectors, comprising the nucleic acid molecules described above. In one embodiment of the invention, the host cell is a hematopoietic T-cell precursor, such as a stem cell, and the nucleic acid is an antisense nucleic acid or a nucleic acid encoding a dominant negative mutant of the Tat-Stimulatory Factor protein.

According to another aspect of the invention, an isolated nucleic acid molecule is provided which comprises a unique fragment of SEQ ID NO:1. In one embodiment the unique fragment is a portion of the segment of SEQ ID NO:1 consisting of SEQ ID NO:3. In another embodiment it is a portion of the segment of SEQ ID NO:1 beginning at nucleotide number 53 and ending at nucleotide number 2703, wherein the fragment is between 12 and 2650 nucleotides in length, and complements thereof. In one embodiment, the unique fragment is at least 150 and, more preferably, at least 200 nucleotides in length. In another embodiment, the unique fragment is between 12 and 32 contiguous nucleotides in length. In all embodiments the unique fragment includes consecutive nucleotides of SEQ ID NO:1 other than the nucleotides of SEQ ID NO:1 which code for SEQ ID NO:3.

According to another aspect of the invention, isolated polypeptides coded for by the isolated nucleic acid molecules described above also are provided as well as functional equivalents, variants, analogs and fragments thereof. In one embodiment, the polypeptide is a human Tat-Stimulatory Factor protein or a functionally active fragment or variant thereof. In another embodiment the polypeptide is a dominant negative mutant of a Tat-Stimulatory Factor protein.

The invention in another aspect involves a method for influencing transcription in a cell. An agent that selectively binds to an isolated nucleic acid molecule as described above or an expression product thereof is introduced within a cell, in an amount effective to alter transcription in the cell. Preferred agents are modified antisense nucleic acids and polypeptides. In one embodiment, transcriptional elongation activity is altered, and in one particularly important embodiment, HIV-1 transcriptional elongation by Tat is altered. In this embodiment, the transcriptional elongation activity can be altered to treat an individual who is infected by HIV.

The invention in another aspect involves a method for isolating a kinase. A solution suspected of containing the
kinase is contacted with a Tat-Stimulatory Factor protein or functional fragment thereof, and a material that binds to the Tat-Stimulatory Factor protein and that has kinase activity is identified and isolated.

The invention in a related aspect involves isolated kinases that are the binding partners of Tat-Stimulatory Factor proteins, and nucleic acids which encode such kinases, as well as functional fragments, variants and analogs of the foregoing.

The invention also provides isolated polypeptides which selectively bind a Tat-Stimulatory Factor protein, a kinase binding partner of a Tat-Stimulatory Factor protein or fragments thereof. Isolated binding polypeptides include antibodies and fragments of antibodies (e.g. Fab, Fd and antibody fragments which include a CDR3 region which binds selectively to a Tat-Stimulatory Factor protein or fragment thereof).

The invention also contemplates gene therapy for HIV-infected individuals, wherein stem cells of a donor are genetically engineered to include an agent that selectively binds to a nucleic acid molecule encoding a Tat-Stimulatory Factor protein or an expression product thereof, whereby said recombinant stem cells are resistant to intracellular HIV replication.

The invention involves methods of screening for compounds which bind to Tat-SF1, Tat-SF1 associated kinase, or a complex of Tat-SF1 and its associated kinase. The invention also involves methods for screening for compounds which modulate Tat-SF1-dependent transcriptional activation. Compounds identified by the methods of the invention are useful for detecting the presence of and/or modulating the activity of Tat-SF1, Tat-SF1 associated kinase, or a complex of Tat-SF1 and its associated kinase.

According to one aspect of the invention, a method of screening for a compound which binds to Tat-SF1 is provided. Tat-SF1 is contacted with the compound and the binding of the compound to Tat-SF1 is determined. The compound can be detectably labeled. In this preferred embodiment, determining the binding involves detecting the labeled compound bound to Tat-SF1. In another embodiment, determination of the binding of the compound to Tat-SF1 includes detecting a change in the biological activity of Tat-SF1. Preferably, Tat-SF1 biological activity is assayed by a Tat-SF1 mediated transcription assay, a Tat-SF1 immunoassay and/or a Tat-SF1-TAR binding assay in the presence and absence of the compound. In certain of the foregoing embodiments, the compound being screened which binds to Tat-SF1 is an oligonucleotide.

According to another aspect of the invention, a method for screening compounds which bind to Tat-SF1 associated kinase is provided. The Tat-SF1 associated kinase is contacted with the compound and the binding of the compound to the Tat-SF1 associated kinase is determined. In one embodiment, the compound is detectably labeled, and determining binding involves detecting the labeled compound bound to Tat-SF1. In another embodiment, determining the binding of the compound to the Tat-SF1 associated kinase involves detecting a change in the biological activity of the Tat-SF1 associated kinase. Preferably, the change in the biological activity of the Tat-SF1 associated kinase is determined by a Tat-SF1 mediated transcription assay, a Tat-SF1 associated kinase immunoassay, or a Tat-SF1 associated kinase substrate phosphorylation assay. In certain of the foregoing embodiments, the compound being screened which binds to the Tat-SF1 associated kinase is an oligonucleotide.

According to still another aspect of the invention, a method for screening a compound which binds to a complex of Tat-SF1 and Tat-SF1 associated kinase is provided. A complex of Tat-SF1 and Tat-SF1 associated kinase is contacted with the compound and the binding of the compound to the complex is determined. The compound can be detectably labeled, and determining binding involves detecting the labeled compound bound to Tat-SF1. In another embodiment, determining the binding involves detecting a change in the biological activity of the complex of Tat-SF1 and Tat-SF1 associated kinase. Preferably, a change in the biological activity of the complex is determined by a Tat-SF1 mediated transcription assay, an immunoassay of [illegible] of Tat-SF1 and Tat-SF1 associated kinase, a Tat-SF1 associated kinase substrate phosphorylation assay, or a Tat-SF1-TAR binding assay. In certain of the foregoing embodiments, the compound being screened which binds to the complex of Tat-SF1 and Tat-SF1 associated kinase is an oligonucleotide.

According to yet another aspect of the invention, a method for screening compounds which modulate Tat-SF1 dependent transcriptional activation is provided. A mammalian cell which includes a gene encoding a Tat-SF1 polypeptide is provided and contacted with the compound. Tat-SF1 mediated transcriptional activation is determined as a measure of the modulation in Tat-SF1 mediated transcription caused by the compound. Preferably, the mammalian cell also includes a gene encoding a TAT polypeptide. More preferably, the mammalian cell also includes an indicator gene, encoding an indicator gene product, operably linked to a TAR element. Still more preferably, the mammalian cell also includes a gene encoding a Tat-SF1 associated kinase polypeptide. In certain embodiments, any one or more of the Tat-SF1 polypeptide, the TAT polypeptide, the indicator gene product and/or the Tat-SF1 associated kinase polypeptide are encoded by transfected expression vectors.

In [illegible] embodiments, the indicator gene encodes beta-galactosidase, alkaline phosphatase, chloramphenicol acetyl transferase, luciferase, or green fluorescent protein.

In preferred embodiments, the TAR element operably linked to the indicator gene is an HIV-1 TAR element.

These and other objects and features of the invention are described in greater detail below.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-C show the identification of Tat-SF activity in cellular extracts. A. Tat activation of HIV transcriptional elongation [illegible]. B. [illegible] of the phosphorylated pp140 on an immobilized HIV-1 TAR RNA. C. The [illegible] domain of Tat is required for pp140 phosphorylation on TAR.

FIGS. 2A-B show that Tat-SF transcriptional activity and pp140 co-peaked during glycerol gradient sedimentation. A. Detection by transcription reaction. B. Detection by kinase assay.

FIGS. 3A-C depict purification of pp140 and Tat-SF transcriptional activity by Tat affinity columns. A. Eluates of the columns were tested for transcription activity. B. The same eluates were tested for the presence of pp140 in a kinase assay. C. A silver-stained SDS gel of the above two fractions is shown.

FIGS. 4A-C show that the presence of pp140 is required for Tat-SF activity. A. pp140 and a cellular kinase form a complex independently of Tat. B. Immunodepletion of
pp140 from a partially purified Tat-SF fraction inactivated Tat-SF transcriptional activity. C. Anti-pp140 antibody efficiently removed pp140 from the Tat-SF fraction.

FIGS. 5A-C. A. Amino acid sequence (SEQ ID NO:2) and domain structure of Tat-SF1. B. Similarity between Tat-SF1 (amino acids 30-44, 82-122, 128-177, 178-217 and 311-342 of SEQ ID NO:2) and human EWS (amino acids 209-223, 113-153, 356-406, 407-446 and 409-440 of SEQ ID NO:4).

FIG. 6 shows that overexpression of Tat-SF1 enhances Tat activation in HeLa cells.

BRIEF DESCRIPTION OF SEQUENCES

SEQ ID NO:1 is a nucleic acid including the coding region of Tat-Stimulatory Factor.

SEQ ID NO:2 is the translated amino acid sequence of the coding region of SEQ ID NO:1.

SEQ ID NO:3 is an expressed sequence tag, an amino acid sequence encoded by a portion of SEQ ID NO:1.

SEQ ID NO:4 is a portion of the amino acid sequence of EWS, which shows some homology with the amino acid sequence of SEQ ID NO:2.

SEQ ID NO:5 is a portion of the nucleic acid of SEQ ID NO:1.

DETAILED DESCRIPTION OF THE INVENTION

Using a reconstituted reaction that supports a TAR-dependent and Tat-specific activation of elongation, we have identified, purified, and isolated a cDNA for a novel cellular activity, Tat-Stimulatory Factor, that is specifically required for Tat trans-activation. This factor is a substrate of an associated cellular kinase. Co-transfection with the cDNA for Tat-Stimulatory Factor specifically stimulates Tat activation of HIV transcription. Sequence analysis indicates that Tat-Stimulatory Factor is related to EWS and FUS/TLS, which are members of a novel family of putative transcription factors with RNA recognition motifs and are frequently associated with many types of sarcomas. It is believed that Tat activates the processivity of elongation by recruitment of a pre-formed complex containing Tat-Stimulatory Factor and a kinase to the HIV-1 promoter through a Tat-TAR interaction.

The mRNA transcript is about 3.0 kb in length, with an open reading frame of 2271 bp. The open reading frame encodes a protein of 754 amino acids with a calculated molecular weight of 85,767 Daltons. Sequence analysis of the protein reveals that it has several unique features. The protein can be roughly divided at position 420 into two halves. The COOH-terminal half is extremely rich in acidic amino acids, with 48% of the last 245 amino acid residues being glutamate or aspartate. The COOH-terminal half also contains many serine residues that lie in a short peptide sequence matching consensus sites for phosphorylation by Casein Kinase II. Such phosphorylation would contribute more negative charges to this region. The NH2-terminal half of Tat-Stimulatory Factor contains two tandem RNA recognition motifs, which have homology to many RNA-binding proteins. Further details about the protein are described in the examples below.

It was determined that overexpression of Tat-Stimulatory Factor enhances Tat activation in vivo. Immunodepletion of the Tat-Stimulatory Factor from a partially purified fraction containing Tat-Stimulatory Factor transcriptional activity eliminates its ability to support Tat trans-activation. Thus, it is believed that Tat-Stimulatory Factor is required for Tat trans-activation.

The invention thus involves in one aspect Tat-Stimulatory Factor proteins, genes encoding those proteins, functional modifications and variants of the foregoing, useful fragments of the foregoing, as well as therapeutics relating thereto.

Homologs and alleles of the Tat-Stimulatory Factor nucleic acids of the invention can be identified by conventional techniques. Thus, an aspect of the invention is those nucleic acid sequences which code for Tat-Stimulatory Factor proteins and which hybridize to a nucleic acid molecule consisting of the coding region of SEQ ID NO:1, under stringent conditions. The term "stringent conditions" as used herein refers to parameters with which the art is familiar. Nucleic acid hybridization parameters may be found in references which compile such methods, e.g. Molecular Cloning: A Laboratory Manual, J. Sambrook et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, or Current Protocols in Molecular Biology, F. M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York. More specifically, stringent conditions, as used herein, refers to hybridization at 65° C in hybridization buffer (3.5xSSC, 0.02% Ficoll, 0.02% polyvinyl pyrrolidone, 0.02% Bovine Serum Albumin, 2.5 mM NaH2PO4 (pH 7), 0.5% SDS, 2 mM EDTA). SSC is 0.15M sodium chloride/0.15M sodium citrate, pH 7; SDS is sodium dodecyl sulphate; and EDTA is ethylenediaminetetraacetic acid. After hybridization, the membrane upon which the DNA is transferred is washed at 2xSSC at room temperature and then at 0.1xSSC/0.1xSDS at 65° C.

There are other conditions, reagents, and so forth which can be used, which result in a similar degree of stringency. The skilled artisan will be familiar with such conditions, and thus they are not given here. It will be understood, however, that the skilled artisan will be able to manipulate the conditions in a manner to permit the clear identification of homologs and alleles of Tat-Stimulatory Factor nucleic acids of the invention. The skilled artisan also is familiar with the methodology for screening cells and libraries for expression of such molecules which then are routinely isolated, followed by isolation of the pertinent nucleic acid molecule and sequencing.

In general homologs and alleles typically will share at least 40% nucleotide identity and/or at least 50% amino acid identity to the coding region of SEQ ID NOs: 1 or 2 (FIG. 5), respectively, in some instances will share at least 50% nucleotide identity and/or at least 65% amino acid identity and in still other instances will share at least 60% nucleotide identity and/or at least 75% amino acid identity. Watson-Crick complements of the foregoing nucleic acids also are embraced by the invention.

In screening for Tat-Stimulatory Factor proteins, a Southern blot may be performed using the foregoing conditions, together with a radioactive probe. After washing the membrane to which the DNA is finally transferred, the membrane can be placed against x-ray film to detect the radioactive signal.

The invention also includes degenerate nucleic acids which include alternative codons to those present in the native materials. For example, serine residues are encoded by the codons TCA, AGT, TCC, TCG, TCT and AGC. Each of the six codons is equivalent for the purposes of encoding a serine residue. Thus, it will be apparent to one of ordinary skill in the art that any of the serine-encoding nucleotide triplets may be employed to direct the protein synthesis
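The codon degeneracy described in this passage can be illustrated with a minimal sketch. It assumes the standard genetic code for the six residues the text names, and the short peptides used with it are hypothetical inputs chosen for illustration, not sequences taken from SEQ ID NO:1 or SEQ ID NO:2. The function names are likewise illustrative.

```python
from itertools import product
from math import prod

# DNA codons from the standard genetic code, limited to the residues
# named in the text. Any other residue would need its own entry.
CODONS = {
    "S": ["TCA", "TCC", "TCG", "TCT", "AGT", "AGC"],  # serine
    "P": ["CCA", "CCC", "CCG", "CCT"],                # proline
    "R": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],  # arginine
    "T": ["ACA", "ACC", "ACG", "ACT"],                # threonine
    "N": ["AAC", "AAT"],                              # asparagine
    "I": ["ATA", "ATC", "ATT"],                       # isoleucine
}


def count_degenerate(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS[residue]) for residue in peptide)


def degenerate_sequences(peptide: str):
    """Yield every DNA sequence that encodes the peptide."""
    for combo in product(*(CODONS[residue] for residue in peptide)):
        yield "".join(combo)


if __name__ == "__main__":
    # Hypothetical dipeptide Ser-Asn: 6 serine codons x 2 asparagine codons.
    print(count_degenerate("SN"))
    for seq in degenerate_sequences("SN"):
        print(seq)
```

The count grows multiplicatively with peptide length, which is why enumeration is only practical for short stretches; for longer sequences the count alone is usually the useful quantity.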
apparatus, in vitro or in vivo, to incorporate a serine residue. Similarly, nucleotide sequence triplets which encode other amino acid residues include, but are not limited to: CCA, CCC, CCG and CCT (proline codons); CGA, CGC, CGG, CGT, AGA and AGG (arginine codons); ACA, ACC, ACG and ACT (threonine codons); AAC and AAT (asparagine codons); and ATA, ATC and ATT (isoleucine codons). Other amino acid residues may be encoded similarly by multiple nucleotide sequences. Thus, the invention embraces degenerate nucleic acids that differ from the biologically isolated nucleic acids in codon sequence due to the degeneracy of the genetic code.

The invention also provides isolated unique fragments of SEQ ID NO:1 or complements of SEQ ID NO:1. A unique fragment is one that is a 'signature' for the larger nucleic acid. It, for example, is long enough to assure that its precise sequence is not found in molecules outside of the Tat-Stimulatory Factor proteins defined above. Unique fragments can be used as probes in Southern blot assays to identify such proteins, or can be used in amplification assays such as those employing PCR. As known to those skilled in the art, large probes such as 200 bp or more are preferred for certain uses such as Southern blots, while smaller fragments will be preferred for uses such as PCR. Unique fragments also can be used to produce fusion proteins for generating antibodies as demonstrated in the Examples, or for generating immunoassay components. Likewise, unique fragments can be employed to produce nonfused fragments of the Tat-Stimulatory Factor proteins, useful, for example, in immunoassays or as a competitive binding partner of the kinase which binds to the Tat-Stimulatory Factor proteins, for example, in therapeutic applications. Unique fragments further can be used as antisense molecules to inhibit the expression of Tat-Stimulatory Factor proteins, particularly for therapeutic purposes as described in greater detail below.

As will be recognized by those skilled in the art, the size of the unique fragment will depend upon its conservancy in the genetic code. Thus, some regions of SEQ ID NO:1 and its complement will require longer segments to be unique while others will require only short segments, typically between 12 and 32 bp (e.g. 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 and 32 bases long). Virtually any segment of the region of SEQ ID NO:1 beginning at nucleotide 53 and ending at nucleotide 2703, or its complement, that is 18 or more nucleotides in length will be unique, except that the unique fragments herein include consecutive nucleotides of SEQ ID NO:1 other than those nucleotides which code for SEQ ID NO:3. Those skilled in the art are well versed in methods for selecting such sequences, typically on the basis of the ability of the unique fragment to selectively distinguish the sequence of interest from non-Tat-Stimulatory Factor proteins. A comparison of the sequence of the fragment to those on known data bases typically is all that is necessary, although in vitro confirmatory hybridization and sequencing analysis may be performed.

The invention also provides isolated, functional unique fragments of SEQ ID NO:2. Such sequences are useful, for example, alone or as fusion proteins to generate antibodies, as a component of an immunoassay, as an inhibitor of Tat-Stimulatory Factor activity, as a binding partner of Tat-Stimulatory Factor binding kinases (for example, for isolating such kinases) or for inhibiting binding of such kinases to Tat-Stimulatory Factor proteins. Such unique fragments can be identified by routine assays, such as those involving testing a fragment's ability to generate antibodies if injected into a proper host, testing the fragment's ability to inhibit Tat-Stimulatory Factor activity, as described below, etc. A unique fragment of a Tat-Stimulatory Factor protein, in general, has the features and characteristics of unique fragments as discussed above in connection with nucleic acids.

As mentioned above, the invention embraces antisense oligonucleotides that selectively bind to a nucleic acid molecule encoding a Tat-Stimulatory Factor protein, to decrease transcription activity, and in particular transcriptional elongation. This is desirable in virtually any medical condition wherein a reduction in transcriptional elongation is desirable, including to reduce HIV-1 transcriptional elongation by Tat. Antisense molecules, in this manner, can be used to slow down or arrest the propagation of HIV in vivo.

As used herein, the term "antisense oligonucleotide" or "antisense" describes an oligonucleotide that is an oligoribonucleotide, oligodeoxyribonucleotide, modified oligoribonucleotide, or modified oligodeoxyribonucleotide which hybridizes under physiological conditions to DNA comprising a particular gene or to an mRNA transcript of that gene and, thereby, inhibits the transcription of that gene and/or the translation of that mRNA. The antisense molecules are designed so as to interfere with transcription or translation of a target gene upon hybridization with the target gene. Those skilled in the art will recognize that the exact length of the antisense oligonucleotide and its degree of complementarity with its target will depend upon the specific target selected, including the sequence of the target and the particular bases which comprise that sequence. It is preferred that the antisense oligonucleotide be constructed and arranged so as to bind selectively with the target under physiological conditions, i.e., to hybridize substantially more to the target sequence than to any other sequence in the target cell under physiological conditions. Based upon SEQ ID NO:1, or upon allelic or homologous genomic and/or cDNA sequences, one of skill in the art can easily choose and synthesize any of a number of appropriate antisense molecules for use in accordance with the present invention. In order to be sufficiently selective and potent for inhibition, such antisense oligonucleotides should comprise at least 10 and, more preferably, at least 15 consecutive bases which are complementary to the target. Most preferably, the antisense oligonucleotides comprise a complementary sequence of 20-30 bases. Although oligonucleotides may be chosen which are antisense to any region of the gene or mRNA transcripts, in preferred embodiments the antisense oligonucleotides correspond to N-terminal or 5' upstream sites such as translation initiation, transcription initiation or promoter sites. In addition, 3'-untranslated regions may be targeted. Targeting to mRNA splicing sites has also been used in the art but may be less preferred if alternative mRNA splicing occurs. In addition, the antisense is targeted, preferably, to sites in which mRNA secondary structure is not expected (see, e.g., Sainio et al., Cell Mol. Neurobiol. 14(5):439-457 (1994)) and at which proteins are not expected to bind. Finally, although SEQ ID NO:1 discloses a cDNA sequence, one of ordinary skill in the art may easily derive the genomic DNA corresponding to the cDNA of SEQ ID NO:1. Thus, the present invention also provides for antisense oligonucleotides which are complementary to the genomic DNA corresponding to SEQ ID NO:1. Similarly, antisense to allelic or homologous cDNAs and genomic DNAs are enabled without undue experimentation.

In one set of embodiments, the antisense oligonucleotides of the invention may be composed of "natural" deoxyribonucleotides, ribonucleotides, or any combination thereof. That is, the 5' end of one native nucleotide and the
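The length guidance for antisense oligonucleotides given above (at least 10 bases, preferably at least 15, most preferably 20-30) can be sketched in code. This is a minimal illustration under stated assumptions: the antisense oligonucleotide is taken to be the reverse complement of a window of the sense-strand cDNA, the function and parameter names are hypothetical, and the target string in the usage example is an arbitrary made-up sequence rather than any part of SEQ ID NO:1. It does not model secondary structure, protein binding sites, or selectivity against other sequences in the cell, all of which the text also asks the designer to consider.

```python
# Translation table for DNA base pairing (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Minimum number of complementary bases stated in the text.
MIN_LENGTH = 10


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo complementary to sense[start:start + length].

    `sense` is the sense-strand (cDNA) sequence, `start` is a 0-based
    offset, and `length` defaults to the low end of the 20-30 base range
    the text calls most preferred.
    """
    if length < MIN_LENGTH:
        raise ValueError(f"length must be at least {MIN_LENGTH} bases")
    if start < 0:
        raise ValueError("start must be non-negative")
    window = sense[start:start + length]
    if len(window) != length:
        raise ValueError("window runs past the end of the sequence")
    return reverse_complement(window)


if __name__ == "__main__":
    # Hypothetical target sequence, for illustration only.
    target = "GATTACAGATTACACCGGTTAAGGCCTTAGCA"
    oligo = antisense_oligo(target, start=2, length=20)
    # Taking the reverse complement again recovers the targeted window.
    print(oligo, reverse_complement(oligo) == target[2:22])
```

In practice a candidate produced this way would still be compared against sequence databases, in the same manner the text describes for unique fragments, before being treated as selective.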
3' end of another native nucleotide may be covalently linked, as in natural systems, via a phosphodiester internucleoside linkage. These oligonucleotides may be prepared by art recognized methods which may be carried out manually or by an automated synthesizer. They also may be produced recombinantly by vectors.

In preferred embodiments, however, the antisense oligonucleotides of the invention also may include "modified" oligonucleotides. That is, the oligonucleotides may be modified in a number of ways which do not prevent them from hybridizing to their target but which enhance their stability or targeting or which otherwise enhance their therapeutic effectiveness.

The term "modified oligonucleotide" as used herein describes an oligonucleotide in which (1) at least two of its nucleotides are covalently linked via a synthetic internucleoside linkage (i.e., a linkage other than a phosphodiester linkage between the 5' end of one nucleotide and the 3' end of another nucleotide) and/or (2) a chemical group not normally associated with nucleic acids has been covalently attached to the oligonucleotide. Preferred synthetic internucleoside linkages are phosphorothioates, alkylphosphonates, phosphorodithioates, phosphate esters, alkylphosphonothioates, phosphoramidates, carbamates, carbonates, phosphate triesters, acetamidates, and carboxymethyl esters.

The term "modified oligonucleotide" also encompasses oligonucleotides with a covalently modified base and/or sugar. For example, modified oligonucleotides include oligonucleotides having backbone sugars which are covalently attached to low molecular weight organic groups other than a hydroxyl group at the 3' position and other than a phosphate group at the 5' position. Thus modified oligonucleotides may include a 2'-O-alkylated ribose group. In addition, modified oligonucleotides may include sugars such as arabinose instead of ribose. The present invention, thus, contemplates pharmaceutical preparations containing modified antisense molecules that are complementary to and hybridizable with, under physiological conditions, nucleic [...]

[...] heterologous DNA or RNA and which can be grown or maintained in culture, may be used in the practice of the invention. Examples include bacterial cells such as E. coli and mammalian cells such as mouse, hamster, pig, goat, primate, etc. They may be of a wide variety of tissue types, including mast cells, fibroblasts, oocytes and lymphocytes, and they may be primary cells or cell lines. Specific examples include CHO cells and COS cells. Cell-free transcription systems also may be used in lieu of cells. In gene therapy applications, human hematopoietic cells that are [illegible].

[illegible] may be any of a number of nucleic acids into which a desired sequence may be inserted by restriction and ligation for transport between different genetic environments or for expression in a host cell. Vectors are typically composed of DNA although RNA vectors are also available. Vectors include, but are not limited to, plasmids and phagemids. A cloning vector is one which is able to replicate in a host cell, and which is further characterized by one or more endonuclease restriction sites at which the vector may be cut in a determinable fashion and into which a desired DNA sequence may be ligated such that the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids, replication of the desired sequence may occur many times as the plasmid increases in copy number within the host bacterium or just a single time per host before the host reproduces by mitosis. In the case of phage, replication may occur actively during a lytic phase or passively during a lysogenic phase. An expression vector is one into which a desired DNA sequence may be inserted by restriction and ligation such that it is operably joined to regulatory sequences and may be expressed as an RNA transcript. Vectors may further contain one or more marker sequences suitable for use in the identification of cells which have or have not been transformed or transfected with the vector. Markers include, for example, genes encoding proteins which increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes which encode enzymes whose activities are detectable by standard assays known in the art
acids encoding Tat-Stimulatory Factor proteins or kinases ( c ,g. p-galactosidase or alkaline phosphatase), and genes 
that bind to Tat-Stimulatory Factor proteins, together with w hich visibly affect the phenotypc of transformed or trans- 
pharmaceutically acceptable carriers. f ccle d cells, colonies or plaques. Preferred vectors are those 
Antisense oligonucleotides may be administered as part of capable of autonomous replication and expression of the 
a pharmaceutical composition. Such a pharmaceutical com- 45 structural gene products present in the DNA segments to 
position may include the antisense oligonucleotides in com- which they are operably joined. 

bination with any standard physiologically and/or pharma- As used herein, a coding sequence and regulatory 

ceutically acceptable carriers which are known in the art. sequences are said to be "operably^ joined when they are 

The compositions should be sterile and contain a therapeu- covalently linked in such a way as to place the expression or 

tically effective amount of the antisense oligonucleotides in 50 transcription of the coding sequence under the influence or 

a unit of weight or volume suitable for administration to a control of the regulatory sequences. If it is desired that the 

patient. The term "pharmaceutically acceptable" means a coding sequences be translated into a functional protein, two 

non-toxic material that docs not interfere with the effective- DNA sequences are said to be operably joined if induction 

ness of the biological activity of the active ingredients. The of a promoter in the 5' regulatory sequences results in the 

term "physiologically acceptable" refers to a non -toxic 55 transcription of the coding sequence and if the nature of the 

material that is compatible with a biological system such as linkage between the two DNA sequences does not (1) result 

a cell, cell culture, tissue, or organism. The characteristics of in the introduction of a frame-shift mutation, (2) interfere 

the carrier will depend on the route of administration. with the ability of the promoter region to direct the tran- 

Physiologically and pharmaceutically acceptable carriers scription of the coding sequences, or (3) interfere with the 

include diluents, fillers, salts, buffers, stabilizers, 60 ability of the corresponding RNA transcript to be translated 

solubilizers, and other materials which arc well known in the into a protein. Thus, a promoter region would be operably 

art. joined to a coding sequence if the promoter region were 

The invention also involves expression vectors coding for capable of effecting transcription of that DNA sequence such 

Tat-Stimulatory Factor proteins and fragments and variants that the resulting transcript might be translated into the 

thereof and Tat Stimulatory Factor antisense, and host cells 65 desired protein or polypeptide. 

containing those expression vectors. Virtually any cells, The precise nature of the regulatory sequences needed for 

prokaryotic or eukaryotic, which can be transformed with gene expression may vary between species or cell types, but 
shall in general include, as necessary, 5' non-transcribing
and 5' non-translating sequences involved with the initiation 
of transcription and translation respectively, such as a TATA 
box, capping sequence, CAAT sequence, and the like. 
Especially, such 5' non-transcribing regulatory sequences 
will include a promoter region which includes a promoter 
sequence for transcriptional control of the operably joined 
gene. Regulatory sequences may also include enhancer 
sequences or upstream activator sequences as desired. The 
vectors of the invention may optionally include 5' leader or
signal sequences, 5' or 3'. The choice and design of an
appropriate vector is within the ability and discretion of one
of ordinary skill in the art.

Expression vectors containing all the necessary elements 
for expression are commercially available and known to 
those skilled in the art. See Sambrook et al., Molecular
Cloning: A Laboratory Manual, Second Edition, Cold 
Spring Harbor Laboratory Press, 1989. Cells are genetically 
engineered by the introduction into the cells of heterologous 
DNA(RNA) encoding the Tat-Stimulatory Factor protein or 
fragment or variant thereof. That heterologous DNA(RNA) 
is placed under operable control of transcriptional elements 
to permit the expression of the heterologous DNA in the host 
cell. 

Examples of systems for mRNA expression in mamma- 
lian cells are those such as pRc/CMV (available from 
Invitrogen, Carlsbad, Calif.) that contain a selectable marker 
such as a gene that confers G418 resistance (which facili- 
tates the selection of stably transfected cell lines) and the 
human cytomegalovirus (CMV) enhancer-promoter 
sequences. Another system suitable for expression in pri- 
mate or canine cell lines is the pCEP4 vector (Invitrogen), 
which contains an Epstein Barr virus (EBV) origin of 
replication, facilitating the maintenance of plasmid as a 
multicopy extrachromosomal element. 

A variety of systems for expression of proteins in 
bacterial, yeast, or insect cells have been described and are 
commercially available. Examples of such systems include 
the Glutathione-S-transferase (GST) Gene Fusion system
available from Pharmacia Biotech, Piscataway, N.J. In this
system a plasmid is constructed containing the protein 
sequence of interest (in this case, the transporter including 
the first extracellular domain) inserted in frame downstream 
of the 25 kDa GST domain from S. japonicum. Expression
of the fusion protein can be induced in transfected bacterial 
cells and the fusion protein purified by affinity chromatog- 
raphy using Glutathione Sepharose 4B. Cleavage of the 
desired peptide from the GST sequences is achieved using a 
site specific protease whose recognition sequence is located 
immediately upstream from the cloning site. An alternative 
system which is desirable since it maintains eukaryotic- 
specific functions such as glycosylation is recombination 
into baculovirus. Standard protocols exist (c.f. O'Reilly et 
al., Baculovirus Expression Vectors: A Laboratory Manual,
IRL/Oxford University Press, 1992) and vectors, cells, and 
reagents are commercially available. Vaccinia virus vectors 
also may be employed. 
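
The in-frame requirement for the GST fusion described above can be checked before cloning. The following is a minimal Python sketch and is not part of the original disclosure: the function name `is_in_frame` and the toy sequences in the usage note are hypothetical, and the actual tag, protease-site and insert sequences would have to be substituted.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame(upstream: str, insert: str) -> bool:
    """Return True if `insert` continues the reading frame of `upstream`.

    `upstream` is the coding DNA read from its start codon (for example a
    tag followed by a protease recognition site); `insert` is the coding
    DNA of the sequence of interest. Both are hypothetical inputs.
    """
    upstream = upstream.upper()
    insert = insert.upper()

    # The upstream region must end on a codon boundary; otherwise the
    # insert would be read in a shifted frame.
    if len(upstream) % 3 != 0:
        return False
    # The insert itself must consist of whole codons.
    if len(insert) % 3 != 0:
        return False

    upstream_codons = [upstream[i:i + 3] for i in range(0, len(upstream), 3)]
    insert_codons = [insert[i:i + 3] for i in range(0, len(insert), 3)]

    # A stop codon in the upstream region would terminate translation
    # before the insert is reached.
    if any(codon in STOP_CODONS for codon in upstream_codons):
        return False
    # A stop codon is tolerated only as the final codon of the insert.
    if any(codon in STOP_CODONS for codon in insert_codons[:-1]):
        return False
    return True
```

For instance, `is_in_frame("ATGGGATCC", "GCTGCTTAA")` is true, whereas dropping one base from the upstream region shifts the frame and the check fails.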

The invention also permits the construction of Tat- 
Stimulatory Factor gene "knock-outs" in cells and in 
animals, providing materials for studying transcription and 
HIV replication. 

The invention also involves polypeptides which bind to 
Tat-Stimulatory Factor proteins, complexes of Tat- 
Stimulatory Factor proteins and their kinase binding 
partners, and to the kinase binding partners of the Tat- 
Stimulatory Factor proteins. Such binding partners can be 
used, for example, in screening assays to detect the presence
or absence of Tat-Stimulatory Factor proteins and their 
kinase binding partners and in purification protocols to 
isolate Tat-Stimulatory Factor proteins and their kinase 
binding partners. Such polypeptides also can be used to
inhibit the native activity of the Tat-Stimulatory Factor 
proteins or their kinase binding partners, for example, by 
binding to such proteins, or their binding partners or both. 

The invention, therefore, involves antibodies or fragments 
of antibodies having the ability to selectively bind to Tat- 
Stimulatory Factor proteins. Antibodies include polyclonal 
and monoclonal antibodies, prepared according to conven- 
tional methodology. 

Significantly, as is well-known in the art, only a small 
portion of an antibody molecule, the paratope, is involved in 
the binding of the antibody to its epitope (see, in general, 
Clark, W. R. (1986) The Experimental Foundations of 
Modern Immunology Wiley & Sons, Inc., New York; Roitt, 
I. (1991) Essential Immunology, 7th Ed. Blackwell Scientific 
Publications, Oxford). The pFc' and Fc regions, for example, 
are effectors of the complement cascade but are not involved 
in antigen binding. An antibody from which the pFc' region
has been enzymatically cleaved, or which has been produced
without the pFc' region, designated an F(ab')2 fragment,
retains both of the antigen binding sites of an intact antibody. 
Similarly, an antibody from which the Fc region has been 
enzymatically cleaved, or which has been produced without 
the Fc region, designated an Fab fragment, retains one of the 
antigen binding sites of an intact antibody molecule. Pro- 
ceeding further, Fab fragments consist of a covalently bound 
antibody light chain and a portion of the antibody heavy 
chain denoted Fd. The Fd fragments are the major determi- 
nant of antibody specificity (a single Fd fragment may be 
associated with up to ten different light chains without 
altering antibody specificity) and Fd fragments retain 
epitope-binding ability in isolation. 

Within the antigen-binding portion of an antibody, as is 
well-known in the art, there are complementarity determin- 
ing regions (CDRs), which directly interact with the epitope 
of the antigen, and framework regions (FRs), which main- 
tain the tertiary structure of the paratope (see, in general, 
Clark, 1986; Roitt, 1991). In both the heavy chain Fd 
fragment and the light chain of IgG immunoglobulins, there 
are four framework regions (FR1 through FR4) separated 
respectively by three complementarity determining regions 
(CDR1 through CDR3). The CDRs, and in particular the 
CDR3 regions, and more particularly the heavy chain 
CDR3, are largely responsible for antibody specificity. 

It is now well-established in the art that the non-CDR 
regions of a mammalian antibody may be replaced with 
similar regions of conspecific or heterospecific antibodies
while retaining the epitopic specificity of the original anti- 
body. This is most clearly manifested in the development 
and use of "humanized" antibodies in which non-human 
CDRs are covalently joined to human FR and/or Fc/pFc' 
regions to produce a functional antibody. Thus, for example, 
PCT International Publication Number WO 92/04381 
teaches the production and use of humanized murine RSV 
antibodies in which at least a portion of the murine FR 
regions have been replaced by FR regions of human origin. 
Such antibodies, including fragments of intact antibodies 
with antigen-binding ability, are often referred to as "chi- 
meric" antibodies. 

Thus, as will be apparent to one of ordinary skill in the art, 
the present invention also provides for F(ab')2, Fab, Fv and 
Fd fragments; chimeric antibodies in which the Fc and/or FR 
and/or CDR1 and/or CDR2 and/or light chain CDR3 regions
have been replaced by homologous human or non-human
sequences; chimeric F(ab')2 fragment antibodies in which
the FR and/or CDR1 and/or CDR2 and/or light chain CDR3
regions have been replaced by homologous human or
non-human sequences; chimeric Fab fragment antibodies in
which the FR and/or CDR1 and/or CDR2 and/or light chain
CDR3 regions have been replaced by homologous human or
non-human sequences; and chimeric Fd fragment antibodies
in which the FR and/or CDR1 and/or CDR2 regions have
been replaced by homologous human or non-human
sequences. The present invention also includes so-called
single chain antibodies.

Thus, the invention involves polypeptides of numerous
size and type that bind specifically to Tat-Stimulatory Factor
proteins, their kinase binding partners and complexes of
both Tat-Stimulatory Factor proteins and their kinase
binding partners. These polypeptides may be derived also from
sources other than antibody technology. For example, such
polypeptide binding agents can be provided by degenerate
peptide libraries which can be readily prepared in solution,
in immobilized form or as phage display libraries.
Combinatorial libraries also can be synthesized of peptides
containing one or more amino acids. Libraries further can be
synthesized of peptoids and non-peptide synthetic moieties.

Phage display can be particularly effective in identifying
binding peptides useful according to the invention. Briefly,
one prepares a phage library (using e.g. M13, fd, or lambda
phage), displaying inserts from 4 to about 80 amino acid
residues using conventional procedures. The inserts may
represent, for example, a completely degenerate or biased
array. One then can select phage bearing inserts which bind
to the Tat-Stimulatory Factor protein. This process can be
repeated through several cycles of reselection of phage that
bind to the Tat-Stimulatory Factor protein. Repeated rounds
lead to enrichment of phage bearing particular sequences.
DNA sequence analysis can be conducted to identify the
sequences of the expressed polypeptides. The minimal linear
portion of the sequence that binds to the Tat-Stimulatory
Factor protein can be determined. One can repeat the
procedure using a biased library containing inserts containing
part or all of the minimal linear portion plus one or more
additional degenerate residues upstream or downstream
thereof. Yeast two-hybrid screening methods also may be used
to identify polypeptides that bind to the Tat-Stimulatory
Factor proteins. Thus, the Tat-Stimulatory Factor molecule
of the invention, or a fragment thereof, can be used to screen
peptide libraries, including phage display libraries, to
identify and select peptide binding partners of the
Tat-Stimulatory Factor proteins of the invention. Such
molecules can be used, as described, for screening assays, for
purification protocols, for interfering directly with the
functioning of Tat and for other purposes that will be apparent to
those of ordinary skill in the art.
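
The statement that repeated rounds of selection enrich for binding phage can be illustrated with simple arithmetic. The sketch below is an assumption-laden model and not a protocol from this disclosure: the function name `enrichment_by_round` and the recovery values in the usage note are hypothetical, and each round is assumed to recover a fixed fraction of binding phage and a much smaller fixed fraction of non-binding phage.

```python
def enrichment_by_round(initial_fraction, binder_recovery,
                        background_recovery, rounds):
    """Return the fraction of binding phage before and after each round.

    initial_fraction    -- fraction of the starting library that binds
    binder_recovery     -- fraction of binding phage recovered per round
    background_recovery -- fraction of non-binding phage carried over
    rounds              -- number of selection cycles
    All values are illustrative assumptions, not measured quantities.
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    if binder_recovery <= 0.0 or background_recovery <= 0.0:
        raise ValueError("recovery fractions must be positive")

    fractions = [initial_fraction]
    fraction = initial_fraction
    for _ in range(rounds):
        bound = fraction * binder_recovery
        background = (1.0 - fraction) * background_recovery
        # Composition of the eluted pool, which seeds the next round.
        fraction = bound / (bound + background)
        fractions.append(fraction)
    return fractions
```

With a starting fraction of one binder per million phage and a thousand-fold preference per round (for example `enrichment_by_round(1e-6, 0.5, 0.0005, 3)`), binders make up about half of the pool after two rounds and the large majority after three, which is why only a few cycles are usually needed under these assumptions.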

The Tat-Stimulatory Factor proteins also can be used to
isolate their native binding partners, including the kinases
that complex with the Tat-Stimulatory Factor proteins. Such
isolation of kinases may be according to well-known
methods. For example, isolated Tat-Stimulatory Factor proteins
can be attached to a substrate, and then a solution suspected
of containing the kinase may be applied to the substrate. If
the kinase binding partner for Tat-Stimulatory Factor
proteins is present in the solution, then it will bind to the
substrate-bound Tat-Stimulatory Factor protein. The kinase
then may be isolated. The kinase also can be isolated by
successive fractionation of a solution containing the kinase,
and determining whether the kinase is present with each
successive phase of fractionation. Kinase activity may be
determined from in vitro kinase reaction using
Tat-Stimulatory Factor as the kinase substrate.

When used therapeutically, the compounds of the inven- 
tion are administered in therapeutically effective amounts. In 
general, a therapeutically effective amount means that 
amount necessary to delay the onset of, inhibit the progres- 
sion of, or halt altogether the particular condition being 
treated. Therapeutically effective amounts specifically will 
be those which desirably influence transcriptional activity. 
When it is desired to decrease such activity, then any 
inhibition of such activity is regarded as a therapeutically 
effective amount. When it is desired to increase such 
activity, then any enhancement of such activity is regarded 
as a therapeutically effective amount. Generally, a therapeu- 
tically effective amount will vary with the subject's age, 
condition, and sex, as well as the nature and extent of the 
disease in the subject, all of which can be determined by one 
of ordinary skill in the art. The dosage may be adjusted by 
the individual physician or veterinarian, particularly in the 
event of any complication. A therapeutically effective 
amount typically varies from 0.01 mg/kg to about 1000 
mg/kg, preferably from about 0.1 mg/kg to about 200 mg/kg 
and most preferably from about 0.2 mg/kg to about 20
mg/kg, in one or more dose administrations daily, for one or 
more days. 
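
The per-kilogram ranges recited above translate into absolute amounts by multiplication with body weight. The following Python sketch shows only that arithmetic; the dictionary keys, the function name `dose_range_mg` and the 70 kg example weight are hypothetical labels introduced for illustration, and the ranges are simply those stated in the preceding paragraph.

```python
# Ranges in mg per kg of body weight, as recited in the text above.
DOSE_RANGES_MG_PER_KG = {
    "typical": (0.01, 1000.0),
    "preferred": (0.1, 200.0),
    "most_preferred": (0.2, 20.0),
}


def dose_range_mg(body_weight_kg, tier="most_preferred"):
    """Convert a mg/kg range into an absolute range in mg.

    `tier` selects one of the ranges above; the tier names are
    hypothetical labels, not terms used in the source.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_MG_PER_KG[tier]
    return low * body_weight_kg, high * body_weight_kg
```

For an assumed 70 kg subject, the most preferred range of about 0.2 mg/kg to about 20 mg/kg corresponds to roughly 14 mg to 1400 mg per administration.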

The therapeutics of the invention can be administered by 
any conventional route, including injection or by gradual 
infusion over time. The administration may, for example, be 
oral, intravenous, intraperitoneal, intramuscular, intracavity, 
intrarespiratory, subcutaneous, or transdermal. 

Preparations for parenteral administration include sterile 
aqueous or non-aqueous solutions, suspensions, and emul- 
sions. Examples of non-aqueous solvents are propylene 
glycol, polyethylene glycol, vegetable oils such as olive oil, 
and injectable organic esters such as ethyl oleate. Aqueous 
carriers include water, alcoholic/aqueous solutions, emul- 
sions or suspensions, including saline and buffered media. 
Parenteral vehicles include sodium chloride solution, Ring- 
er's dextrose, dextrose and sodium chloride, lactated Ring- 
er's or fixed oils. Intravenous vehicles include fluid and 
nutrient replenishers, electrolyte replenishers (such as those
based on Ringer's dextrose), and the like. Preservatives and 
other additives may also be present such as, for example, 
antimicrobials, anti-oxidants, chelating agents, and inert 
gases and the like. 

The invention also contemplates gene therapy. The pro- 
cedure for performing ex vivo gene therapy is outlined in 
U.S. Pat. No. 5,399,346 and in exhibits submitted in the file 
history of that patent, all of which are publicly available 
documents. In general, it involves introduction in vitro of a 
functional copy of a gene or fragment thereof into a cell(s) 
of a subject and returning the genetically engineered cell(s)
to the subject. The functional copy of the gene or fragment
thereof is under operable control of regulatory elements 
which permit expression of the gene in the genetically 
engineered cell(s). Numerous transfection and transduction 
techniques as well as appropriate expression vectors are well 
known to those of ordinary skill in the art, some of which are 
described in PCT application WO95/00654. 

As an illustrative example, primary human blood cells 
which are precursors of T-cells can be obtained from the 
bone marrow of a subject who is a candidate for such gene 
therapy. Then, such cells can be genetically engineered ex 
vivo with DNA (RNA) encoding an agent that binds to a 
Tat-Stimulatory Factor nucleic acid or expression product 
thereof. The genetically engineered cells then are returned to
the patient. Such recombinant cells are expected to resist 
intracellular HIV replication. 

The invention also contemplates targeting to particular
cells the nucleic acids and proteins of the invention,
including specifically antisense nucleic acids and agents that bind
Tat-Stimulatory Factor proteins. Targeting may be
tissue-specific, using targeting agents known to those of ordinary
skill in the art. Methodologies for targeting include
conjugates, such as those described in U.S. Pat. No. 5,391,723
to Priest. Another example of a well-known targeting
vehicle is liposomes. Liposomes are commercially available
from Gibco BRL (Life Technologies, Inc., Gaithersburg,
Md.). Numerous methods are published for making targeted
liposomes. A preferred cell type is a T-cell, and, in particular,
a T-cell infected with HIV.

According to another aspect of the invention, a method of
screening for compounds which bind to Tat-SF1,
Tat-SF1-associated kinase, or a complex of Tat-SF1 and
Tat-SF1-associated kinase is provided.

The methods disclosed herein are useful for identifying
specific compounds or molecules which bind to Tat-SF1,
Tat-SF1 associated kinase or a complex of Tat-SF1 and
Tat-SF1 associated kinase. Such compounds can have utility
as reagents for affinity purification, for example where the
compound is immobilized and used to "capture" Tat-SF1,
Tat-SF1 associated kinase or the complex of Tat-SF1 and
Tat-SF1 associated kinase from a biological extract such as
a cell or tissue homogenate. Such compounds can also be
used for localization of Tat-SF1, Tat-SF1 associated kinase
or a complex of Tat-SF1 and Tat-SF1 associated kinase in an
intact biological system such as a cell or a tissue. The methods
disclosed herein are also useful for preparation of a "library"
of high probability drug candidates. Compounds can be
identified as binding Tat-SF1, Tat-SF1 associated kinase or
a complex of Tat-SF1 and Tat-SF1 associated kinase by a
change in a selected biological activity, such as modulation
of Tat-SF1-mediated transcriptional activity, Tat-SF1
associated kinase phosphorylation activity and the like. High
probability drug candidates are those compounds which cause
a change in a biological activity. Such changes in activity
can provide potential therapeutic benefits. Compounds
identified by in vitro assays as potential drug candidates can
be tested subsequently in cell- or animal-based disease
models to determine more accurately the therapeutic
potential of the drug candidates.

One screening method is an in vitro binding assay of a
type familiar to one of ordinary skill in the art. To perform
such an assay, a test compound is contacted with the
Tat-SF1, Tat-SF1-associated kinase, or a complex of Tat-SF1
and Tat-SF1-associated kinase and the binding of the
compound to the Tat-SF1, Tat-SF1-associated kinase, or a
complex of Tat-SF1 and Tat-SF1-associated kinase is
determined. The binding can be determined in a number of ways.
For example, binding of a labeled compound can be detected
by methods well known to those of ordinary skill in the art.
Binding of a compound also can be determined by a
competitive binding assay, such as by a reduction in binding of
an antibody to the Tat-SF1, Tat-SF1-associated kinase, or a
complex of Tat-SF1 and Tat-SF1-associated kinase or by the
inhibition of binding between Tat-SF1 and Tat-SF1
associated kinase. Binding of a compound also can be determined
by a change in electrophoretic mobility or chromatographic
elution profile of Tat-SF1, Tat-SF1-associated kinase, or a
complex of Tat-SF1 and Tat-SF1-associated kinase relative
to the profile of such a polypeptide (or complex) not bound
by the compound.
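
The competitive binding readout described above, a reduction in antibody binding in the presence of a test compound, is commonly summarized as a percent inhibition. The sketch below shows one way to compute it under stated assumptions: the function name `percent_inhibition`, the background-subtraction step and the signal values in the usage note are hypothetical and are not taken from this disclosure.

```python
def percent_inhibition(signal_with_compound, signal_without_compound,
                       background=0.0):
    """Percent reduction in binding signal caused by a test compound.

    signal_with_compound    -- signal measured with the compound present
    signal_without_compound -- signal measured with no compound (control)
    background              -- signal with no binding partner, subtracted
                               from both readings (assumed, optional)
    """
    control = signal_without_compound - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    test = signal_with_compound - background
    return 100.0 * (1.0 - test / control)
```

As an example with made-up counts, a control signal of 1100, a background of 100 and a signal of 300 in the presence of the compound give an inhibition of 80 percent; a negative result would indicate that the compound increased binding.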



Preferably, the compound is an oligonucleotide.
Oligonucleotides useful in the invention can be prepared
according to standard methods in the art. Oligonucleotides which
bind to Tat-SF1, Tat-SF1-associated kinase, or a complex of
Tat-SF1 and Tat-SF1-associated kinase are preferably
prepared by binding and screening for binding activity,
followed by random or targeted mutation of the nucleotides
which constitute the oligonucleotide in an iterative fashion.
Commercially available libraries can be screened or, as
would be more likely, can be prepared for screening.
Preparation of peptide libraries is described herein. Lam (Nature
354:82-84, 1991) also describes combinatorial methods for
creating libraries of synthesized peptides on polystyrene
beads, each bead carrying only one peptide. Similar
procedures are known for making libraries of oligonucleotides.
Methods for the selection of several classes of molecules
(including oligonucleotides, peptides, and RNAs) also are
described in Abelson, Science 249:488-489, 1990; Ellington
et al., Nature 346:818-822, 1990; Tuerk et al., Science
249:505-510, 1990; Irvine et al., J. Mol. Biol. 222:739-761,
1991; Bock et al., Nature 355:564-566, 1992; Ellington et
al., Nature 355:850-852, 1992; Gallop et al., J. Med. Chem.
37:1233-1251, 1994; Gordon et al., J. Med. Chem.
37:1385-1401, 1994; and Gold, J. Biol. Chem.
270:13581-13584, 1995.
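
The iterative cycle of screening for binding followed by random mutation can be pictured with a toy simulation. The following Python sketch is purely illustrative and makes strong assumptions: `binding_score` is a hypothetical stand-in for a measured binding signal (it merely counts matches to an arbitrary motif), and the pool size, mutation rate and retained fraction are invented parameters rather than values from this disclosure.

```python
import random

BASES = "ACGT"


def binding_score(oligo, motif):
    """Hypothetical stand-in for a binding measurement: count the
    positions at which the oligonucleotide matches an arbitrary motif."""
    return sum(a == b for a, b in zip(oligo, motif))


def mutate(oligo, rate, rng):
    """Replace each base with a randomly chosen base with probability `rate`."""
    return "".join(rng.choice(BASES) if rng.random() < rate else base
                   for base in oligo)


def select_and_mutate(pool, motif, rounds, keep=0.2, rate=0.1, seed=0):
    """Run `rounds` cycles of selection followed by mutation.

    In each cycle the best-scoring fraction `keep` of the pool is
    retained unchanged, and the pool is refilled with mutated copies of
    the retained sequences. Returns the final pool, best first.
    """
    rng = random.Random(seed)
    pool = list(pool)
    size = len(pool)
    for _ in range(rounds):
        ranked = sorted(pool, key=lambda o: binding_score(o, motif),
                        reverse=True)
        survivors = ranked[:max(1, int(size * keep))]
        offspring = [mutate(rng.choice(survivors), rate, rng)
                     for _ in range(size - len(survivors))]
        pool = survivors + offspring
    return sorted(pool, key=lambda o: binding_score(o, motif), reverse=True)
```

Because the retained sequences are carried over unchanged, the best score in the pool cannot decrease from one cycle to the next in this model; in a real selection the score would be an experimental measurement and no such guarantee applies.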

In one embodiment of the assay, the compound is
detectably labeled. The binding of the labeled compound bound to
Tat-SF1, Tat-SF1-associated kinase, or a complex of Tat-SF1
and Tat-SF1-associated kinase then can be determined
by any method known to one of skill in the art. The
particular method chosen to detect the labeled compound
will depend on the nature of the label. For instance, a
radioactively labeled compound can be detected by
scintillation counting, autoradiography, and phosphorimaging.
Fluorescently labeled compounds can be detected by
fluorometry. Other detectable labels are known in the art, along
with suitable detection methods.

The compound also can be labeled with a molecule which 
serves as a binding point for a detectable label. For example, 
a compound can be labeled with a biotin molecule which 
serves as a binding point for a streptavidin molecule which
is detectably labeled. 

Other preferred methods for determining the binding of a
compound include detecting a change in a biological activity
of the Tat-SF1, Tat-SF1-associated kinase, or a complex of
Tat-SF1 and Tat-SF1-associated kinase. Determinable
biological activities include Tat-SF1 mediated transcription
from TAR elements, changes in protein conformation and/or
protein-protein interaction which are detectable in an
immunoassay, e.g. by antibody binding, Tat-SF1-associated
kinase substrate phosphorylation, and binding of Tat-SF1 to
a TAR element. For example, Tat-SF1 mediated
transcription can be determined by a reconstituted transcription assay
in which Tat-SF1 is combined with transcription factors
required to initiate and elongate transcription of an RNA
containing a TAR element. The change in transcription
mediated by Tat-SF1 which is caused by binding of a
compound can be readily determined by comparing the
transcription in the presence and in the absence of the
compound. Tat-SF1-associated kinase substrate
phosphorylation can be determined by inclusion of radiolabeled ATP in
a phosphorylation reaction and determined by observing the
difference in the radiolabel incorporated into a substrate of
the Tat-SF1-associated kinase. Modulation of binding of
Tat-SF1 to a TAR element by binding of a compound to
Tat-SF1, Tat-SF1-associated kinase, or a complex of Tat-SF1
and Tat-SF1-associated kinase can be determined by a
well-known assay such as an electrophoretic mobility shift
assay (EMSA). A change in electrophoretic mobility
observed upon contacting the compound with Tat-SF1,
Tat-SF1-associated kinase, or a complex of Tat-SF1 and
Tat-SF1-associated kinase will reflect a change in the
binding to a TAR element. Changes in protein conformation
and/or protein-protein interaction are detectable by
immunoassays using antibodies which recognize a particular
protein conformation or protein-protein interaction such as
the interaction of Tat-SF1 and Tat-SF1-associated kinase. A
change in antibody binding subsequent to contacting
Tat-SF1, Tat-SF1-associated kinase and/or a complex of the two
with a compound is readily determined using standard
immunoassay techniques. Immunoassays which measure
disruption of antibody binding to a particular epitope of
Tat-SF1, Tat-SF1-associated kinase and/or a complex of the
two are useful for determining binding of a compound to that
particular epitope. One of ordinary skill in the art will
recognize that a panel of antibodies which recognize a
variety of epitopes will enable determination of the binding
site of a compound which binds to Tat-SF1,
Tat-SF1-associated kinase and/or a complex of the two. Biological
activities and methods of determining a change in the
activities subsequent to binding of a compound to Tat-SF1,
Tat-SF1-associated kinase, or a complex of Tat-SF1 and
Tat-SF1-associated kinase, are described more fully in the
Examples below.

It is not intended that the foregoing represents an
exhaustive listing of methods useful for determining the binding of
a compound to Tat-SF1, Tat-SF1-associated kinase, or a
complex of Tat-SF1 and Tat-SF1-associated kinase.
Additional detectable labels, or biological activities of Tat-SF1,
Tat-SF1-associated kinase, or a complex of Tat-SF1 and
Tat-SF1-associated kinase will be apparent to one of
ordinary skill in the art.

According to another aspect of the invention, a method of
screening for compounds which modulate Tat-SF1-mediated
transcriptional activity is provided. The method involves (1)
providing a mammalian cell containing a gene encoding a
Tat-SF1 polypeptide; (2) contacting the mammalian cell
with one or more compounds, preferably under conditions to
induce Tat-SF1 mediated transcription; and (3) determining
the Tat-SF1 mediated transcriptional activation.

Cell-based screening methods are provided herein for
determining the Tat-SF1-mediated transcriptional activity
modulating potential of compounds. These methods are
useful for identifying compounds which can modulate
Tat-SF1-mediated transcriptional activity in the presence of
cellular proteins and other factors involved in the
transcription of genes in a cell. Thus, such methods permit screening

link the TAR element to the nucleic acid using standard
molecular biology techniques, and test for a change in
transcription in nucleic acid by standard techniques, e.g.,
quantitative nucleic acid hybridization, polymerase chain
reaction amplification, nuclease protection assay and the
like. Alternatively, one of ordinary skill in the art can use a
nucleic acid which contains a TAR element in the foregoing
assay to determine the effect of a test compound on
Tat-SF1-mediated transcriptional activity. For example, HIV-1
sequences containing a TAR element can be used in such an
assay, such as a reporter construct containing HIV-1 LTR
linked to the bacterial CAT gene. Preferably an indicator
gene is transcribed and the amount of indicator gene product
is determined in the presence and the absence of the
compound.

In preferred embodiments, the mammalian cell used in the
assay of Tat-SF1-mediated transcriptional activation
modulation contains a gene encoding a Tat polypeptide
and/or an indicator gene operably linked to a TAR
element. [illegible] in a variety of
expression vectors, including plasmid- or virus-based
episomal vectors and vectors which integrate into the
host cell chromosomes, as is known to those skilled in the
art. The genes described above can be introduced into the
mammalian cell by standard procedures, or can be resident
in the cell, such as encoded by the cell's chromosomal
nucleic acid. Standard procedures for introducing nucleic
acids into a mammalian cell include, but are not limited to,
physical means such as transfection, electroporation and
bombardment with nucleic acid-coated microparticles, and
biological means such as receptor-mediated endocytosis
using targeted nucleic acids or nucleic acids contained in
targeted liposomes and viral infection. The particular vectors
and/or means of introducing genes may depend on the
mammalian cell chosen for the assay. The cells and cell lines
useful in such assays include HeLa cells, COS cells and the
like.

[illegible] of the polypeptide and/or encode only a part of
the polypeptide where the gene product to be detected is a
nucleic acid or where the portion of the polypeptide encoded
by the gene retains the activity of the whole polypeptide or
an epitope bound by an antibody. For example, an indicator
gene need only encode an RNA gene product that is
detectable by hybridization in an assay such as RNase
protection, oligonucleotide hybridization, or reverse
transcriptase polymerase chain reaction (RT-PCR). Where a
polypeptide gene product is to be detected, a fragment which
is sufficient to retain a detectable activity in an
immunoassay, or an enzymatic activity assay, would be
sufficient to meet the criteria disclosed above.

o f 00 mpounds in a variety of cells which may be particularly As used herein, modulation of Tat-SFl -mediated tran- 
relevant to a disease state. For example, to identify com- scriptional activation refers to the ability of a compound to 
pounds which are useful for reducing Tat-SFl -mediated modulate Tat transcriptional activation, mediated by Tat- 
transcriptional activity as a means of reducing HIV-1 SF1, from a TAR RNA element. Thus, a Tal-SFl-mediated 
infection, one of ordinary skill in the art can select an 55 transcriptional activation modulating activity refers to the 
appropriate cell within the host range of HI V-l, such as a T ability of a compound to reduce or increase the ability of 
cell, to perform the screening assay in. Tat-SFl to stimulate Tat-dependent transcriptional elonga- 
The skilled artisan can readily determine the effect of a tion of RNA molecules containing a TAR element. For 
test compound on Tat-SFl -mediated transcriptional activa- example, compounds which have Tat-SFl-medialed trac- 
tion in a cell-based assay. It is known that the presence of 60 scriptional activation modulating activity, include com- 
Tat-SFl in a cell stimulates transcription of genes which are pounds which disrupt the binding of Tat-SFl to a TAR 
operably linked to a TAR sequence, in particular by increas- element, compounds which disrupt the binding of Tat-SFl to 
ing elongation of nascent transcripts. Thus detection of a a Tat-SFl-associated kinase, compounds which disrupt the 
change in Tat-SFl mediated-transcriptional activation phosphorylation of Tat-SFl by a Tat-SFl-associated kinase, 
involves detecting increased (or decreased) transcription of 65 compound which disrupt the recruitment of additional trail - 
a TAR-Iinked nucleic acid. The skilled artisan can choose a scription factors to the RNA containing a TAR element, and 
nucleic acid sequence to link to a TAR element, operably the like. 
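As an illustration of how a change in transcription of a TAR-linked indicator might be scored in such a cell-based screen, the following is a minimal sketch. The function name, the replicate readings and the two-fold threshold are hypothetical choices made for the example; the specification does not prescribe any particular cutoff or data format.

```python
from statistics import mean


def classify_modulation(treated, untreated, threshold=2.0):
    """Score a compound by the fold change in TAR-linked indicator signal.

    treated / untreated: replicate indicator readings (arbitrary units)
    from cells with and without the test compound.
    threshold: hypothetical fold-change cutoff, not taken from the source.
    """
    if not treated or not untreated:
        raise ValueError("need at least one reading per condition")
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated signal must be positive")
    fold = mean(treated) / baseline
    if fold >= threshold:
        label = "increases"
    elif fold <= 1.0 / threshold:
        label = "reduces"
    else:
        label = "no clear effect"
    return fold, label


if __name__ == "__main__":
    # Hypothetical readings for one compound, two replicates per condition.
    print(classify_modulation([10, 12], [40, 48]))
```

In practice the readings would come from whichever indicator gene product is chosen, and a compound scored as "reduces" would still need follow-up to show that the effect is specific to Tat-SF1-mediated activation.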
Tat and Tat-SF1 proteins can bind to a TAR element in a promoter which controls the transcription of an indicator gene. The indicator gene product, whether nucleic acid or protein, provides a readily detectable output for screening compounds for Tat-SF1-mediated transcriptional activation modulating activity. An indicator gene permits high throughput screening of a number of compounds at one time. Depending on the choice of indicator gene, and its gene product, automation of the methods disclosed herein is also contemplated.

Preferred indicator genes encode an indicator gene product which is easily detectable with or without disruption of the mammalian cell. Gene products such as RNA transcripts of the gene can be detected by hybridization assays or any other assay which detects nucleic acids of a specific sequence. Preferably, hybridization assays are conducted under stringent conditions as defined above. Indicator gene products which are proteins can be detected by any technique suitable for detection of proteins. For example, an indicator gene can encode an indicator gene product to which antibodies have been raised. Such antibodies which selectively bind to the indicator gene product can be employed in immunoassays to determine the amount of protein produced as a result of increased Tat-responsive transcription in the presence and absence of a compound. Antibodies useful in the detection of indicator gene products include commercially available antibodies such as antibodies to green fluorescent protein (Clontech, Palo Alto, Calif.), E. coli bacterial alkaline phosphatase, β-galactosidase (Boehringer Mannheim, Indianapolis, Ind.), and luciferase (Cortex Biochemicals, San Leandro, Calif.).

The indicator gene product can be a protein which has an enzymatic activity that can be assayed. Methods for determining the amount of such enzymes by colorimetric means (for example, conversion of X-gal into a blue product by β-galactosidase) or radioactive means (for example, addition of a 3H-acetyl or 14C-acetyl group to a chloramphenicol molecule by chloramphenicol acetyl transferase) will be known to one of ordinary skill in the art. Other indicator gene products, such as green fluorescent protein, are directly detectable by colorimetric means.

Preferably, the indicator gene is selected from the group consisting of the genes encoding the following proteins: β-galactosidase, alkaline phosphatase, chloramphenicol acetyl transferase, luciferase, and green fluorescent protein. However, the skilled artisan may select any indicator gene for use in accordance with the methods of the invention provided that the indicator gene product is detectable.

The indicator gene is operably linked to a promoter containing at least one TAR element which drives expression of the indicator gene by serving as the locus for binding of Tat, Tat-SF1, associated proteins and an RNA polymerase.

The nucleic acid molecules introduced into the mammalian cell may be contained on a plasmid or other extrachromosomal nucleic acid, or may be incorporated into the cell's chromosomes. Non-chromosomal nucleic acids include plasmids, phagemids, bacteriophage genomes, virus genomes, and the like. Non-chromosomal nucleic acids useful for preparation of expression vectors are well known in the art and thus are not described further here.

EXAMPLES

Example 1

Tat Activation of HIV-1 Transcription Requires A Specific Cellular Activity, Tat-SF

We have previously established a partially reconstituted transcription reaction that supports a Tat-specific and TAR-dependent activation of HIV transcription (Zhou and Sharp, EMBO J. 14:321-328, 1995; FIG. 1A). This reaction requires a Tat-SF (Tat-Stimulatory Factor) activity that is specific for Tat stimulation of elongation. It also requires a phosphocellulose 0.5-1.0 M KOAc fraction of HeLa nuclear extract, termed the pc-D fraction, and the purified basal factors TFIID, TFIIA, and transcription factor Sp1. Reactions containing these components, but lacking Tat-SF activity, supported activation by Sp1 and GAL4-VP16 (Zhou and Sharp, 1995), but not by Tat (lanes 1 and 2, FIG. 1A). The transcription reactions were performed as follows. Reconstituted transcription reactions containing both templates pHIV+TAR-G400 and pHIVΔTAR-G100 were performed in the absence (-) or presence (+) of the 0.5 M KCl Q-Sepharose chromatographic fraction containing Tat-SF [...]. The pc-D fraction and purified [...] and Sp1 were present in all reactions. G-less cassettes of two different lengths were inserted into the above two templates at position +955 downstream of the HIV-1 initiation site to measure the effect of Tat on transcriptional elongation. Transcripts derived from these two templates were digested by RNase T1 and the resulting 400- and 100-nucleotide G-less RNA fragments were separated in a denaturing polyacrylamide gel, and their positions were indicated by the arrows. In the presence of a partially purified Tat-SF fraction, Tat specifically increased the number of transcripts elongating beyond 1000 nucleotides from a HIV-1 promoter containing the wild-type TAR element (pHIV+TAR-G400), but not from an internal control promoter with a mutant TAR (pHIVΔTAR-G100, compare lanes 3 and 4).

The pc-D fraction was shown by Western blotting to contain the basal transcription factors TFIIB, IIE, IIF, IIH, and RNA polymerase II (Zhou and Sharp, 1995). This fraction probably also contains other activities important for Tat function, because it cannot be substituted with highly purified basal transcription factors. Using the reconstituted reaction detailed above to follow the activity, Tat-SF was further purified by several chromatographic steps. HeLa nuclear extract in buffer D/0.1 M KCl (20 mM Hepes-KOH, pH 7.9/20% (vol/vol) glycerol/0.1 M KCl/0.2 mM EDTA/0.5 mM dithiothreitol/0.5 mM PMSF) was loaded on a phosphocellulose column preequilibrated with the same buffer. The flowthrough was loaded on a DEAE-Sepharose FF (Pharmacia, Piscataway, N.J.) matrix column preequilibrated with buffer D/0.1 M KCl. After washing the column with the same buffer, Tat-SF activity was eluted from the column with buffer D/0.3 M KCl. This fraction was dialyzed against buffer D/0.1 M KCl and applied to a Q-Sepharose FF (Pharmacia) matrix column preequilibrated with the same buffer. The column was washed with buffer D/0.1 M KCl and the bound proteins were eluted with a 0.1-0.7 M KCl gradient made in buffer D. Fractions were analyzed for Tat-SF activity in reconstituted transcription assays and for pp140 in kinase reactions (as described below). The 0.4-0.5 M KCl Q-Sepharose fraction containing Tat-SF activity and pp140 was dialyzed against buffer D/0.1 M KCl and applied to a Heparin Sepharose column. After washing the column extensively with buffer D/0.1 M KCl, Tat-SF/pp140 was eluted with increasing salt concentrations and was found mostly in 0.2-0.4 M KCl fractions. These fractions were combined, dialyzed to 0.1 M KCl, and loaded on a Glutathione Sepharose (Pharmacia) column containing GST-Tat proteins. After washing with buffer D/0.1 M KCl, Tat-SF/pp140 was eluted from the column with buffer D containing 1.4 M KCl. The estimated overall purification after these steps was ~3000-fold.
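A fold-purification figure of this kind is conventionally obtained by comparing specific activity (activity per unit of protein) before and after the chromatographic steps. The sketch below shows that arithmetic under stated assumptions: the function names and all input values are hypothetical, since the specification reports only the overall estimate and not the underlying activity or protein measurements.

```python
def specific_activity(total_activity_units, total_protein_mg):
    """Activity per mg of protein for one fraction."""
    if total_protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return total_activity_units / total_protein_mg


def fold_purification(start, final):
    """Ratio of specific activities of the final and starting fractions.

    start / final: (total_activity_units, total_protein_mg) tuples.
    """
    return specific_activity(*final) / specific_activity(*start)


def percent_yield(start, final):
    """Percentage of the starting activity recovered in the final fraction."""
    return 100.0 * final[0] / start[0]


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the calculation.
    nuclear_extract = (1000.0, 512.0)   # units, mg
    affinity_eluate = (375.0, 0.0625)   # units, mg
    print(fold_purification(nuclear_extract, affinity_eluate))
    print(percent_yield(nuclear_extract, affinity_eluate))
```

Tracking yield alongside fold purification is a common design choice, because a step that greatly enriches the activity but recovers very little of it may not be worth keeping in the scheme.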
Example 2

Detection of a Complex Containing Tat, a Cellular Kinase and a 140 kD Phosphoprotein in the Transcription Reaction

Phosphorylation of RNA polymerase II has been implicated in regulation of the processivity of elongation (O'Brien et al., Nature 370:75-77, 1994; Dahmus, Biochim Biophys Acta 1261:171-182, 1995). To investigate whether protein phosphorylation might be associated with Tat-SF, an immobilized HIV TAR RNA was used to collect bound proteins from a reconstituted transcription reaction in the presence of γ-32P ATP. Referring to FIG. 1B, biotinylated TAR RNA (nucleotides +1 to +82) immobilized on the paramagnetic beads was introduced into the kinase reactions containing γ-32P ATP (10 mCi) and either the pc-D fraction alone (lanes 3 and 4), or the Tat-SF fraction alone (lanes 5 and 6), or both pc-D and Tat-SF fractions together (lanes 1, 2, and 7). Recombinant wild type Tat protein (13 ng) was included in the reactions shown in lanes 2, 4, and 6. Tat mutant TatK41A (13 ng) was present in lane 7 (Rice and Carlotti, J. Virol. 64:1864-1868, 1990). After incubation for 10 min at 30° C., the TAR RNA beads were washed [...] with buffer containing 100 mM KCl and 0.1% NP40 and the bound proteins were analyzed by SDS-PAGE. In reactions containing either the pc-D fraction alone (lanes 3 and 4, FIG. 1B) or the Tat-SF fraction alone (lanes 5 and 6), addition of Tat did not consistently affect the phosphorylation of proteins on immobilized TAR. When both fractions were incubated together in the presence of Tat, phosphorylation of a protein of approximately 140 kD, termed pp140, was observed (FIG. 1B, lane 2). In the absence of Tat, however, only a small amount of phosphorylated pp140 was detected (lane 1). Thus, collection of a phosphorylated pp140 on TAR required the presence of the pc-D fraction, the Tat-SF fraction, and Tat. This suggested the existence of a complex on TAR that consists of Tat, a cellular kinase probably derived from the pc-D fraction, and the kinase substrate pp140 from the Tat-SF fraction (see below).

An intact Tat activation domain was necessary for pp140 phosphorylation on TAR. When a non-functional Tat mutant (K41A; Rice and Carlotti, 1990), which has the lysine at position 41 substituted by alanine, was present in the kinase reaction (FIG. 1B, lane 7), the amount of phosphorylated pp140 collected on the immobilized TAR was significantly reduced (compare lanes 2 and 7). Importantly, this was not due to a decreased ability of K41A to interact with TAR, since K41A bound to TAR as efficiently as wild type Tat in a gel mobility-shift assay.

Similar results were also obtained with another Tat mutant (TatC), which lacks the cysteine-rich activation domain (amino acids 22 to 37; Rice and Carlotti, 1990) and is completely defective for transcriptional activity (FIG. 1C). Kinase reactions containing an immobilized TAR were prepared as above. Recombinant wild type Tat protein (13 ng) or Tat mutant (TatC, 13 ng) lacking the cysteine-rich domain (amino acids 22-37) was included in the reactions as indicated. No Tat was present in the control reaction (lane 1).

The polypeptide pp140 co-purified and co-titrated with Tat-SF transcriptional activity during purification through multiple chromatography steps. For example, when partially purified Tat-SF was sedimented through a glycerol gradient and the first 10 fractions were analyzed in the reconstituted transcription assay for the presence of Tat-SF activity (FIG. 2A), Tat-SF activity and pp140 co-peaked. A 0.2-0.4 M KCl Heparin Sepharose fraction (load, lanes 3 and 4), prepared as described above, containing Tat-SF activity was loaded onto a 12-35% glycerol gradient and subjected to ultracentrifugation. The first 10 of a total 16 fractions were tested for Tat-SF activity in the reconstituted transcription reactions in the absence (-) or presence (+) of Tat as in FIG. 1A. Control reactions (lanes 1 and 2) did not contain Tat-SF fraction. Protein molecular weight markers were sedimented in a parallel gradient and analyzed by silver staining. The peak of Tat-SF activity that supported Tat trans-activation was in fractions #4 and #3, corresponding to a native molecular mass of approximately 100 kD. When the same fractions were analyzed for the presence of pp140 in a kinase assay as described in FIG. 1B (see FIG. 2B), phosphorylation of pp140 was also most evident in fractions #4 and #3, in agreement with the transcription results.

Example 3

Purification of pp140 By Tat Affinity Column Chromatography

As noted above, detection of the phosphorylated pp140 on TAR requires the presence of the pc-D fraction, the Tat-SF fraction and Tat. Sequential incubations of these components with an immobilized TAR revealed a stable and direct interaction between Tat and a cellular kinase and between Tat and pp140. Columns containing immobilized Tat were used to affinity-purify Tat-SF/pp140. The 0.2-0.4 M KCl heparin Sepharose fraction (load) containing Tat-SF activity described in Example 1 was subjected to fractionation through an Affi-Gel 10 matrix column (Bio-Rad, Hercules, Calif.) containing immobilized Tat. Tat-SF activity was eluted from the column with increasing salt concentrations. The 0.6 M KCl fraction was analyzed in FIG. 3. Fractions eluted from either a GST-Tat column or a Tat affinity column were analyzed in reconstituted transcription assays for the presence of Tat-SF activity (FIG. 3A). Both fractions were enriched in Tat-SF activity which supported a Tat-specific and TAR-dependent activation. The same two fractions were tested in a kinase assay as described in Example 1 and were found to contain pp140 (FIG. 3B). When analyzed by silver staining, the polypeptide profiles of these two fractions were different overall, with the only common band having a mobility of 140 kD (FIG. 3C). This polypeptide was [...] to be pp140 and probably a component of Tat-SF activity.

The 140 kD polypeptide was recovered from the SDS polyacrylamide gel by blotting onto a nitrocellulose membrane. Approximately 15 μg pp140 was recovered from the membrane and subjected to digestion with lys-C. Six major peptides were obtained and microsequenced. Sequence analysis of the six peptides indicated that pp140 was a novel protein. However, one of the peptides (KMNAQETATGMAFEEPIDE, SEQ ID NO:3) was contained in the sequence of an unidentified "expressed sequence tag" (EST), EST60354, in the Washington University/Merck EST database. A 103 amino acid protein fragment (FIG. 5A, amino acids 387 to 489) encoded by the corresponding EST clone was expressed as a GST fusion and used to immunize rabbits for the production of polyclonal antisera. By Western blotting, the affinity-purified antibody specifically recognized a 140 kD protein present in both HeLa nuclear extracts and a partially purified Tat-SF fraction (FIG. 4C).

Example 4

pp140 and a Cellular Kinase Form a Complex Independently of Tat

The Tat-SF specific antibody was used to test the relationship between the EST clone and pp140. The Tat-SF
fraction was subjected to immunodepletion with the affinity purified antibody and then incubated together with the pc-D fraction and Tat. The polypeptide pp140 was immunodepleted from a 0.4-0.5 M KCl Q-Sepharose fraction (Tat-SF fraction) through incubation on ice twice for 1.5 hr each with the affinity purified anti-pp140 antibody immobilized on protein A Sepharose beads. Referring to FIG. 4, the depleted (lanes 5 and 6) or undepleted Tat-SF fractions (no depl, lanes 1-4) were incubated with the pc-D fraction in the absence (-) or presence (+) of Tat and the reactions were subjected to immunoprecipitation with the immobilized anti-pp140 antibody (lanes 3-6). Preimmune antibody was used in control precipitations (lanes 1 and 2). An unfractionated HeLa nuclear extract (NE) with (+) or without (-) the addition of Tat was also subjected to immunoprecipitation with the specific antibody (lanes 7 and 8). After extensive washes with buffer D containing 100 mM KCl, 0.1% NP-40, and 10 mM MgCl2, the immune-complex bound to protein A Sepharose beads was analyzed in a kinase reaction in the presence of γ-32P ATP for 10 min at 30° C., washed with buffer D, and analyzed by SDS-PAGE (FIG. 4A, lane 6). In contrast to the control undepleted Tat-SF fraction (lane 4), the fraction depleted with the specific antibody did not contain the phosphorylated pp140 (compare lanes 4 and 6). Therefore, the 140 kD protein recovered from the SDS gel and represented by the EST clone was indeed pp140, the kinase substrate.

These reactions also suggest that the polypeptide pp140 [...]. When the Tat-SF fraction and the pc-D fraction were incubated together in the absence of Tat followed by immunoprecipitation with the anti-pp140 antibody, pp140 was phosphorylated by its associated kinase when the isolated immune-complex was assayed in the kinase reaction (FIG. 4A, lane 3). This result indicates that pp140 forms a complex with its kinase in the absence of Tat. Furthermore, the addition of Tat to the initial incubation did not change the level of phosphorylation on pp140 (compare lanes 3 and 4).

A preformed complex containing pp140 and its kinase could be isolated by immunoprecipitation and detected in a kinase reaction from an unfractionated HeLa nuclear extract in the absence of Tat (FIG. 4A, compare lanes 7 and 8). This complex was stable under transcription conditions (less than 0.1 M KCl), but dissociated in washes of greater than 0.25 M KCl, and probably dissociated during fractionation in the purification of Tat-SF. These observations suggest that Tat is not required for the phosphorylation of pp140 by its associated kinase, but is required for the association of the phosphorylated pp140 and the kinase with TAR (FIG. 1B). Thus, Tat probably recruits a preformed complex containing pp140 and a kinase to the HIV promoter region during transcription.

Example 5

pp140 is Required for Tat-SF Transcriptional Activity

To test whether pp140 is indeed necessary for Tat activation, the anti-pp140 antibody was used to immunodeplete pp140 from a partially purified fraction containing Tat-SF activity and the depleted fraction was then tested in reconstituted transcription reactions for its ability to support Tat activation (FIG. 4B). A 0.4-0.5 M KCl Q-Sepharose fraction containing Tat-SF activity was subjected to immunodepletion with preimmune antibody (lanes 5, 6, 9, and 10) or specific anti-pp140 antibody (lanes 7, 8, 11, and 12) immobilized on protein A Sepharose beads as in Example 4. Tat-SF fraction subjected to depletion once (1x) or twice (2x), and the undepleted Tat-SF fraction (lanes 3 and 4) were tested in transcription reactions for Tat-SF activity as described above. No Tat-SF fraction was present in the control reactions (lanes 1 and 2). Control reactions without Tat-SF did not support Tat activation (FIG. 4B, lanes 1 and 2). Inclusion of the Tat-SF fraction resulted in a TAR-dependent activation by Tat as expected (lanes 3 and 4). As compared to the control Tat-SF fraction (lanes 3 and 4), depletion of the Tat-SF fraction with specific anti-pp140 antibody [...] a Sepharose matrix either once (lanes 7 and 8) or twice (lanes 11 and 12) significantly reduced its ability to support Tat activation. In contrast, depletion of the Tat-SF fraction with preimmune antibody either once (lanes 5 and 6) or twice (lanes 9 and 10) did not significantly reduce Tat activation. Similarly, depletion of the Tat-SF fraction with an unrelated antibody (anti-HA mAb 12CA5) had no effect on Tat activation. Undepleted Tat-SF fraction and Tat-SF fraction twice-depleted with the specific or preimmune antibody were subjected to electrophoresis and Western blotting with the anti-pp140 antisera. As expected, a Western blot (FIG. 4C) indicates that depletion with the anti-pp140 antibody efficiently removed pp140 from the Tat-SF fraction. Taken together, these experiments strongly argue that pp140 is indeed necessary for Tat-SF transcriptional activity.

Example 6

Isolation of the cDNA Encoding pp140

[...] COOH-terminus of the Tat-SF1 gene and its 3' untranslated region was labeled and used as a probe to screen a λZiplox [...] (J. Borrow, MIT, Cambridge, Mass.). [...] the autonomously-replicating plasmid pZL1 using the protocol provided by the manufacturer (Gibco BRL). Inserts from the seven independent plaques had similar restriction endonuclease cleavage patterns and sequencing confirmed that they contained overlapping segments. The largest cDNA clone containing the full length Tat-SF1 gene was named pZL-Tat-SF1-4b and was sequenced by dideoxy-DNA sequencing with T7 DNA polymerase. The largest cDNA fragment was 2.8-kb in length and contained a 2271-bp open reading frame. There were multiple in-frame stop codons both upstream and downstream of this coding region. Surprisingly, the open reading frame encoded a protein of 754 amino acids with a calculated molecular weight of 85,767 daltons (FIG. 5A), which was significantly less than the apparent molecular weight of 140 kD calculated from the mobility in an SDS polyacrylamide gel.

This cDNA was judged to encode the authentic full length pp140 based on several observations. First, transfection of this cDNA into human 293T cells (Pear et al., Proc. Natl. Acad. Sci. USA 90:8392-8396, 1993) resulted in the production of [...] pp140 polypeptide and a significant increase in the total cellular level of pp140 as judged by Western blotting. Second, all six peptide sequences obtained from partial sequencing of pp140 were found in the predicted coding region (underlined in FIG. 5). Third, Northern analysis of poly(A)+ RNA isolated from several different types of human cells detected a single 3.0 kb species, a length consistent with that of the cDNA segment and adequate to encode a polypeptide of 86 kD. Finally, this cDNA and two additional cDNA clones isolated from a completely different cDNA library had identical upstream in-frame stop codons.

Sequence analysis of the protein, referred to as Tat-SF1, is shown in FIG. 5A. Glutamate (E) and aspartate (D)
residues present in the COOH-terminal half of Tat-SF1
(amino acids 420 to 754) are shown in bold face type. The
two RNA recognition motifs (RRMs) in the NH2-terminal
half of Tat-SF1 are boxed, with the conserved RNP1 and
RNP2 motifs shown in shaded area and bold face type,
respectively. The six peptides of Tat-SF1 that were generated
by digestion with lys-C and subjected to microsequencing
are underlined. The regions of Tat-SF1 that are homologous
to human EWS are underlined with broken lines.
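The acidic character of a region, highlighted in the figure by the bolded glutamate and aspartate residues, can be expressed as the fraction of D and E residues within a window of the protein sequence. The following is a minimal sketch of that count. The function name is hypothetical and the sequence in the example is a made-up toy string, not the Tat-SF1 sequence of SEQ ID NO:2.

```python
def acidic_fraction(sequence, start, end):
    """Fraction of Asp (D) and Glu (E) residues in a window.

    sequence: one-letter amino acid string.
    start, end: 1-based inclusive residue positions, as used in the text.
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("window must lie within the sequence")
    window = sequence[start - 1:end].upper()
    acidic = sum(window.count(residue) for residue in "DE")
    return acidic / len(window)


if __name__ == "__main__":
    toy = "MKDEEDAG"  # hypothetical toy sequence, for illustration only
    print(acidic_fraction(toy, 3, 6))
    print(acidic_fraction(toy, 1, len(toy)))
```

For a 754-residue protein, a window covering the last 245 residues would run from position 510 to position 754 in this 1-based convention.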



The sequence analysis of the protein revealed that it has several unique features. The protein can be roughly divided at position 420 into two halves. The COOH-terminal half was extremely rich in acidic amino acids, with 48% of the last 245 amino acid residues as glutamate or aspartate. The unusual acidic nature of [...] its aberrant mobility in an SDS gel. The COOH-terminal half also contained many serine residues that are arranged in a short peptide sequence matching consensus sites for phosphorylation by Casein Kinase II (Marshak and Carroll, Methods Enzymol. 200:134-156, 1991). Such phosphorylation would contribute more negative charges to this region.

The NH2-terminal half of Tat-SF1 contained two tandem RNA recognition motifs (Kenan et al., Trends Biochem. Sci. 16:214-220, 1991) which have homology to many RNA-binding proteins. Interestingly, the first RRM of Tat-SF1 (amino acids 128 to 217, boxed in FIG. 5A) was similar in length and displayed the strongest sequence homology to the RRMs located in the COOH-terminal half of two closely related human proteins, EWS (Delattre et al., Nature 359:162-165, 1992; Sorensen et al., Nature Genet. 6:146-151, 1994) (FIG. 5B) and FUS/TLS (Crozat et al., Nature 363:640-644, 1993; Rabbitts et al., Nature Genet. 4:175-180, 1993). Furthermore, the sequence homology between Tat-SF1 and EWS, or between Tat-SF1 and FUS/TLS, extended beyond the two RRMs into the immediate NH2-terminal region of Tat-SF1 (FIGS. 5A and B). The amino acid sequences of the homologous regions of Tat-SF1 (SEQ ID NO:2) and EWS (SEQ ID NO:4) are compared in FIG. 5B. The amino acids of each protein are numbered next to the sequences. Vertical lines and dots indicate identical and conserved residues, respectively. EWS has two tandem, imperfect repeats (amino acids 209 to 236) that show homology to Tat-SF1 (amino acids 30 to 44). The alignment between the first repeat (amino acids 209-223) of EWS and Tat-SF1 is shown. The first RRM of Tat-SF1 (amino acids 128 to 446) is almost identical in length, and is 27% identical and 52% similar in amino acid sequence to the RRM of EWS. Sequence homology similar to that observed between Tat-SF1 and EWS also exists between Tat-SF1 and human FUS/TLS, which is closely related to EWS. The RRMs of other RNA binding proteins are less homologous and show greater variations in length as revealed by the BLAST

Tat-SF1 and a reporter construct containing HIV-1 LTR linked to the bacterial CAT gene were introduced into HeLa cells either in the presence or absence of a cotransfected plasmid expressing Tat (Table I and FIG. 6). As a control, the effect of Tat-SF1 overexpression on transcriptional activation by the acidic activation domain VP16 in the TFEB-VP16 and GAL4-VP16 fusion proteins was assayed (Harper et al., Proc. Natl. Acad. Sci. USA 93:8536-8540, 1996). The Tat-SF1 gene was subcloned into the mammalian expressing vector pSV7d (Truett et al., DNA 4:333-349, 1985) to create pSV-Tat-SF1. pSV-Tat-SF1 or vector pSV7d and a reporter pBennCAT (Gendelman et al., Proc. Natl. Acad. Sci. USA 83:9759-9763, 1986) containing HIV-1 LTR linked to CAT (1 μg each) and an internal control plasmid pCMVβ-Gal were co-transfected into HeLa cells, [...] pcTat (0.3 μg) (Tiley et al., Virology 178:560-567, 1990). CAT activity was measured 48 hr later as described (Neumann et al., BioTechniques 5:444-447, 1987). In control experiments, pSV-Tat-SF1 or pSV7d and the reporter construct pMyc3E1BLuc (Harper et al., 1996) were introduced into HeLa cells together with the plasmid pRCCMV-TFEB-VP16 (0.3 μg) expressing the TFEB-VP16 fusion protein (Harper et al., 1996). pMyc3E1BLuc contained the luciferase gene downstream of the Adenovirus E1B promoter with three binding sites for TFEB. Reporter construct pG5E1BCAT (Lillie et al., Nature 338:39-44, 1989) containing five GAL4-binding sites inserted upstream of the E1B promoter and the CAT gene was used to assay GAL4-VP16 trans-activation. The fold activation by Tat or VP16 in cells containing the empty vector was assigned a value of 1, and activation in the presence of Tat-SF1 was adjusted accordingly. The mean value from three experiments was shown.

Expression of Tat-SF1 from the transfected DNA consistently resulted in an increase in Tat activation by an average of 5.2-fold as compared to the control HeLa cells transfected with an empty vector (FIG. 6 and Table I). The enhanced activation mediated by Tat-SF1 was Tat-specific, since overexpression of Tat-SF1 had little, or sometimes even a slightly negative, effect on transcriptional activation by TFEB-VP16. Interestingly, the elevated fold induction by Tat resulting from Tat-SF1 overexpression was caused by a combination of a decrease in the basal level of transcription from the LTR in the absence of Tat and a small increase in the level of Tat-activated transcription (Table I). Since Tat-SF1 is probably a component of a protein complex that also includes a cellular kinase and perhaps other [...] components, overexpression of Tat-SF1 alone may disrupt the normal stoichiometry of the complex resulting in a decrease in the basal level of HIV transcription. The presence of Tat could stabilize and recruit the active form of the complex to the HIV promoter to stimulate the processivity of elongation.

EQUIVALENTS 

Those skilled in the art will recognize, or be able to 
ascertain using no more than routine experimentation, many 
equivalents to the specific embodiments of the invention 
described herein. Such equivalents are intended to be 
encompassed by the following claims. 

All references disclosed herein are incorporated by ref- 
6S erence in their entirety. 

To investigate whether overexpression of Tat-SFl affects A Sequence Listing is presented below and is followed by 
the level of Tat activation in vivo, a plasmid expressing what is claimed: 



algorithm (Altschul et al., J. Mol Biol 215:403-410, 1990). 

These observations suggest that Tat-SFl is related to EWS 
and FUS/TLS, which are members of a novel class of 
putative transcription factors that presumably interact with 
RNA. Both EWS and FUS/TLS are involved in many forms 
of human solid tumors (Ladanyi, Diagn. Mol Pathol 
4:162-173, 1995; Rabbitts, Nature 372:143-149, 1994), 
such as E wing's sarcoma (Delattre et al., 1992; Sorensen et 
al., 1994) and human myxoid liposarcoma (Crozat et al., 
1993; Rabbitts et al., 1993), through chromosomal translo- 
cations. 

Example 7 

Overexpression of Tat-SFl Enhances Tat Activation 
In Vivo 
SEQUENCE LISTING



<160> NUMBER OF SEQ ID NOS: 5

<210> SEQ ID NO 1
<211> LENGTH: 2815
<212> TYPE: DNA
<213> ORGANISM: Homo sapiens
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: 110..2371
<221> NAME/KEY: unsure
<222> LOCATION: 46..46
<223> OTHER INFORMATION: n = a, c, g or t
<221> NAME/KEY: unsure
<222> LOCATION: 2731..2731
<223> OTHER INFORMATION: n = a, c, g or t

<400> SEQUENCE: 1

gggaaagctg gtacgcctgc aggtaccggt ccggaattcc ggccgngtcg aaagcgtcat 60

ttcggcctct tagttcttct gaaccctgct cctgagctag gtaggaaac atg agc ggc 118
Met Ser Gly
1

acc aac ttg gat ggg aac gat gag ttt gat gag cag ttg cga atg caa 166
Thr Asn Leu Asp Gly Asn Asp Glu Phe Asp Glu Gln Leu Arg Met Gln
5 10 15

gaa ttg tac gga gac ggc aag gat ggt gac acc cag acc gat gcc ggc 214
Glu Leu Tyr Gly Asp Gly Lys Asp Gly Asp Thr Gln Thr Asp Ala Gly
20 25 30 35

gga gaa ccc gat tct ctc ggg cag cag ccg acg gac act ccc tac gag 262
Gly Glu Pro Asp Ser Leu Gly Gln Gln Pro Thr Asp Thr Pro Tyr Glu
40 45 50

tgg gac ctg gac aaa aag gct tgg ttc ccc aag att act gaa gat ttc 310
Trp Asp Leu Asp Lys Lys Ala Trp Phe Pro Lys Ile Thr Glu Asp Phe
55 60 65

att gct aca tat cag gcc aat tat ggc ttc tct aac gat ggc gca tct 358
Ile Ala Thr Tyr Gln Ala Asn Tyr Gly Phe Ser Asn Asp Gly Ala Ser
70 75 80

agt tct acc gca aat gtt gaa gat gtc cat gct agg act gca gag gaa 406
Ser Ser Thr Ala Asn Val Glu Asp Val His Ala Arg Thr Ala Glu Glu
85 90 95

cct cca caa gaa aaa gcc ccg gaa ccc act gat gcc aga aag aag gga 454
Pro Pro Gln Glu Lys Ala Pro Glu Pro Thr Asp Ala Arg Lys Lys Gly
100 105 110 115

gaa aaa aga aag gct gag tca gga tgg ttt cat gtt gaa gaa gac aga 502
Glu Lys Arg Lys Ala Glu Ser Gly Trp Phe His Val Glu Glu Asp Arg
120 125 130

aat aca aat gta tac gtg tct ggt ttg cct cca gat att aca gtg gat 550
Asn Thr Asn Val Tyr Val Ser Gly Leu Pro Pro Asp Ile Thr Val Asp
135 140 145

gaa ttt ata caa ctt atg tcc aag ttt ggc att att atg aga gat cct 598
Glu Phe Ile Gln Leu Met Ser Lys Phe Gly Ile Ile Met Arg Asp Pro
150 155 160

cag aca gaa gaa ttt aag gtc aaa ctt tac aaa gat aat caa gga aat 646
Gln Thr Glu Glu Phe Lys Val Lys Leu Tyr Lys Asp Asn Gln Gly Asn
165 170 175

ctt aaa gga gac ggt ctt tgc tgt tat ttg aaa aga gaa tct gtg gaa 694
Leu Lys Gly Asp Gly Leu Cys Cys Tyr Leu Lys Arg Glu Ser Val Glu
180 185 190 195

ctt gca tta aaa ctt ttg gat gaa gat gaa att aga ggc tac aaa tta 742
Leu Ala Leu Lys Leu Leu Asp Glu Asp Glu Ile Arg Gly Tyr Lys Leu
200 205 210
-continued

cat gtt gag gtg gca aag ttt caa ctg aag gga gaa tat gat gcc tca 790
His Val Glu Val Ala Lys Phe Gln Leu Lys Gly Glu Tyr Asp Ala Ser
215 220 225

aag aag aag aag aag tgc aaa gac tat aag aag aag ctg tct atg caa 838
Lys Lys Lys Lys Lys Cys Lys Asp Tyr Lys Lys Lys Leu Ser Met Gln
230 235 240

caa aag cag ttg gat tgg aga cct gag agg cga gcc gga cca tcc cgg 886
Gln Lys Gln Leu Asp Trp Arg Pro Glu Arg Arg Ala Gly Pro Ser Arg
245 250 255

atg cgc cat gag cga gtt gtc atc atc aag aat atg ttt cat cct atg 934
Met Arg His Glu Arg Val Val Ile Ile Lys Asn Met Phe His Pro Met
260 265 270 275

gat ttt gag gat gat ccg ttg gtg ctg aat gag atc aga gaa gac ctt 982
Asp Phe Glu Asp Asp Pro Leu Val Leu Asn Glu Ile Arg Glu Asp Leu
280 285 290

cga gta gag tgt tcg aag ttt gga caa att agg aaa ctc ctt ctc ttt 1030
Arg Val Glu Cys Ser Lys Phe Gly Gln Ile Arg Lys Leu Leu Leu Phe
295 300 305

gat agg cac cca gat ggt gtg gcc tct gtg tcc ttt cgg gat cca gag 1078
Asp Arg His Pro Asp Gly Val Ala Ser Val Ser Phe Arg Asp Pro Glu
310 315 320

gaa gct gat tat tgt att cag act ctc gat gga aga tgg ttt ggt ggc 1126
Glu Ala Asp Tyr Cys Ile Gln Thr Leu Asp Gly Arg Trp Phe Gly Gly
325 330 335

cgt caa atc act gcc cag gca tgg gat ggg act aca gat tat cag gtg 1174
Arg Gln Ile Thr Ala Gln Ala Trp Asp Gly Thr Thr Asp Tyr Gln Val
340 345 350 355

gag gaa acc tca aga gaa agg gag gaa agg ctg aga gga tgg gag gct 1222
Glu Glu Thr Ser Arg Glu Arg Glu Glu Arg Leu Arg Gly Trp Glu Ala
360 365 370

ttc ctc aat gct cct gag gcc aac aga ggc ctt agc gtt cag att ctg 1270
Phe Leu Asn Ala Pro Glu Ala Asn Arg Gly Leu Ser Val Gln Ile Leu
375 380 385

tct ctg ctt cga aag gca ggg cct tct aga gca agg cat ttt tca gag 1318
Ser Leu Leu Arg Lys Ala Gly Pro Ser Arg Ala Arg His Phe Ser Glu
390 395 400

cac ccc agc aca tct aaa atg aat gct caa gaa act gca act gga atg 1366
His Pro Ser Thr Ser Lys Met Asn Ala Gln Glu Thr Ala Thr Gly Met
405 410 415

gca ttt gaa gaa cct ata gat gag aag aag ttt gaa aag aca gaa gat 1414
Ala Phe Glu Glu Pro Ile Asp Glu Lys Lys Phe Glu Lys Thr Glu Asp
420 425 430 435

ggg gga gaa ttt gaa gaa ggt gct tct gaa aac aat gct aag gaa agt 1462
Gly Gly Glu Phe Glu Glu Gly Ala Ser Glu Asn Asn Ala Lys Glu Ser
440 445 450

agc ccc gaa aaa gag gct gaa gaa ggc tgc cct gaa aaa gaa tct gaa 1510
Ser Pro Glu Lys Glu Ala Glu Glu Gly Cys Pro Glu Lys Glu Ser Glu
455 460 465

gag ggc tgc ccc aaa aga ggg ttt gaa ggc agc tgc tcc caa aaa gag 1558
Glu Gly Cys Pro Lys Arg Gly Phe Glu Gly Ser Cys Ser Gln Lys Glu
470 475 480

tct gaa gaa ggc aat ccc gta aga gga tct gaa gag gat agt cct aaa 1606
Ser Glu Glu Gly Asn Pro Val Arg Gly Ser Glu Glu Asp Ser Pro Lys
485 490 495

aaa gag tct aaa aag aag aca ctc aaa aat gat tgt gaa gag aat ggc 1654
Lys Glu Ser Lys Lys Lys Thr Leu Lys Asn Asp Cys Glu Glu Asn Gly
500 505 510 515

ctt gca aag gaa tct gaa gat gac ctc aac aag gag tct gaa gag gag 1702
Leu Ala Lys Glu Ser Glu Asp Asp Leu Asn Lys Glu Ser Glu Glu Glu
520 525 530
gtt ggc ccc aca aaa gag tcc gaa gaa gat gac tca gag aaa gag tct 1750
Val Gly Pro Thr Lys Glu Ser Glu Glu Asp Asp Ser Glu Lys Glu Ser
535 540 545

gat gaa gac tgc tct gaa aaa cag tct gaa gat ggc tcc gaa aga gaa 1798
Asp Glu Asp Cys Ser Glu Lys Gln Ser Glu Asp Gly Ser Glu Arg Glu
550 555 560

ttt gaa gaa aat ggt ctc gag aaa gat ttg gac gag gaa ggt tct gaa 1846
Phe Glu Glu Asn Gly Leu Glu Lys Asp Leu Asp Glu Glu Gly Ser Glu
565 570 575

aag gag ctt cat gaa aat gtt ctt gac aaa gag tta gaa gaa aat gac 1894
Lys Glu Leu His Glu Asn Val Leu Asp Lys Glu Leu Glu Glu Asn Asp
580 585 590 595

tct gaa aac tcc gaa ttt gaa gat gac ggc tct gaa aaa gtg tta gat 1942
Ser Glu Asn Ser Glu Phe Glu Asp Asp Gly Ser Glu Lys Val Leu Asp
600 605 610

gag gaa ggc tct gag aga gag ttt gac gaa gat tca gat gaa aag gaa 1990
Glu Glu Gly Ser Glu Arg Glu Phe Asp Glu Asp Ser Asp Glu Lys Glu
615 620 625

gaa gag gag gat aca tat gaa aaa gta ttt gat gat gag tct gat gag 2038
Glu Glu Glu Asp Thr Tyr Glu Lys Val Phe Asp Asp Glu Ser Asp Glu
630 635 640

aaa gag gat gaa gaa tat gca gat gaa aag ggg ctt gaa gct gct gat 2086
Lys Glu Asp Glu Glu Tyr Ala Asp Glu Lys Gly Leu Glu Ala Ala Asp
645 650 655

aaa aag gcg gaa gaa ggt gat gca gat gaa aag ctg ttt gaa gag tca 2134
Lys Lys Ala Glu Glu Gly Asp Ala Asp Glu Lys Leu Phe Glu Glu Ser
660 665 670 675

gat gac aag gaa gat gaa gat gca gat gga aag gaa gtt gaa gat gct 2182
Asp Asp Lys Glu Asp Glu Asp Ala Asp Gly Lys Glu Val Glu Asp Ala
680 685 690

gac gaa aag ttg ttc gaa gat gat gat tcc aat gag aag ttg ttt gat 2230
Asp Glu Lys Leu Phe Glu Asp Asp Asp Ser Asn Glu Lys Leu Phe Asp
695 700 705

gag gag gaa gat tcc agt gag aag ttg ttt gac gat tct gat gag agg 2278
Glu Glu Glu Asp Ser Ser Glu Lys Leu Phe Asp Asp Ser Asp Glu Arg
710 715 720

ggg act ttg ggt ggt ttt ggg agt gtt gaa gaa ggg ccc cta tcc act 2326
Gly Thr Leu Gly Gly Phe Gly Ser Val Glu Glu Gly Pro Leu Ser Thr
725 730 735

ggc agc agc ttt att ctc agt agc gat gat gat gac gat gat att taatc 2376
Gly Ser Ser Phe Ile Leu Ser Ser Asp Asp Asp Asp Asp Asp Ile
740 745 750

ccttaaactt gctttttagg gagagtcctc catctacatt tgcctgtgct tcagggtaat 2436

tactagtagt gttacatgaa catgtgcata gtggtaggat gccatcagat taaagcattg 2496

aagtgtttca ttgttacctg tacctaatgg ttttaaatat atgttaattg attgtttagt 2556

taaaatgtca tagttacaat gcaagtaaac tggatacttg ttcttttgtc agatttgtta 2616

aatgcatgca gaataatatt tttaagagta ttgattgaag tttgtgatat tcatcaataa 2676

aaatgagttg ataatatgca gaaactgaaa aaaaaaaaaa aaaaaaaagt cgacncggcc 2736

ggaattcccg ggtcgacgag ctcactagtc ggcggccgct ctagaggatc caagcttacg 2796

tacgcgtgca tgcgacgtc 2815



<210> SEQ ID NO 2
<211> LENGTH: 754
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 2

Met Ser Gly Thr Asn Leu Asp Gly Asn Asp Glu Phe Asp Glu Gln Leu
1 5 10 15

Arg Met Gln Glu Leu Tyr Gly Asp Gly Lys Asp Gly Asp Thr Gln Thr
20 25 30

Asp Ala Gly Gly Glu Pro Asp Ser Leu Gly Gln Gln Pro Thr Asp Thr
35 40 45

Pro Tyr Glu Trp Asp Leu Asp Lys Lys Ala Trp Phe Pro Lys Ile Thr
50 55 60

Glu Asp Phe Ile Ala Thr Tyr Gln Ala Asn Tyr Gly Phe Ser Asn Asp
65 70 75 80

Gly Ala Ser Ser Ser Thr Ala Asn Val Glu Asp Val His Ala Arg Thr
85 90 95

Ala Glu Glu Pro Pro Gln Glu Lys Ala Pro Glu Pro Thr Asp Ala Arg
100 105 110

Lys Lys Gly Glu Lys Arg Lys Ala Glu Ser Gly Trp Phe His Val Glu
115 120 125

Glu Asp Arg Asn Thr Asn Val Tyr Val Ser Gly Leu Pro Pro Asp Ile
130 135 140

Thr Val Asp Glu Phe Ile Gln Leu Met Ser Lys Phe Gly Ile Ile Met
145 150 155 160

Arg Asp Pro Gln Thr Glu Glu Phe Lys Val Lys Leu Tyr Lys Asp Asn
165 170 175

Gln Gly Asn Leu Lys Gly Asp Gly Leu Cys Cys Tyr Leu Lys Arg Glu
180 185 190

Ser Val Glu Leu Ala Leu Lys Leu Leu Asp Glu Asp Glu Ile Arg Gly
195 200 205

Tyr Lys Leu His Val Glu Val Ala Lys Phe Gln Leu Lys Gly Glu Tyr
210 215 220

Asp Ala Ser Lys Lys Lys Lys Lys Cys Lys Asp Tyr Lys Lys Lys Leu
225 230 235 240

Ser Met Gln Gln Lys Gln Leu Asp Trp Arg Pro Glu Arg Arg Ala Gly
245 250 255

Pro Ser Arg Met Arg His Glu Arg Val Val Ile Ile Lys Asn Met Phe
260 265 270

His Pro Met Asp Phe Glu Asp Asp Pro Leu Val Leu Asn Glu Ile Arg
275 280 285

Glu Asp Leu Arg Val Glu Cys Ser Lys Phe Gly Gln Ile Arg Lys Leu
290 295 300

Leu Leu Phe Asp Arg His Pro Asp Gly Val Ala Ser Val Ser Phe Arg
305 310 315 320

Asp Pro Glu Glu Ala Asp Tyr Cys Ile Gln Thr Leu Asp Gly Arg Trp
325 330 335

Phe Gly Gly Arg Gln Ile Thr Ala Gln Ala Trp Asp Gly Thr Thr Asp
340 345 350

Tyr Gln Val Glu Glu Thr Ser Arg Glu Arg Glu Glu Arg Leu Arg Gly
355 360 365

Trp Glu Ala Phe Leu Asn Ala Pro Glu Ala Asn Arg Gly Leu Ser Val
370 375 380

Gln Ile Leu Ser Leu Leu Arg Lys Ala Gly Pro Ser Arg Ala Arg His
385 390 395 400

Phe Ser Glu His Pro Ser Thr Ser Lys Met Asn Ala Gln Glu Thr Ala
405 410 415

Thr Gly Met Ala Phe Glu Glu Pro Ile Asp Glu Lys Lys Phe Glu Lys
420 425 430

Thr Glu Asp Gly Gly Glu Phe Glu Glu Gly Ala Ser Glu Asn Asn Ala
435 440 445

Lys Glu Ser Ser Pro Glu Lys Glu Ala Glu Glu Gly Cys Pro Glu Lys
450 455 460

Glu Ser Glu Glu Gly Cys Pro Lys Arg Gly Phe Glu Gly Ser Cys Ser
465 470 475 480

Gln Lys Glu Ser Glu Glu Gly Asn Pro Val Arg Gly Ser Glu Glu Asp
485 490 495

Ser Pro Lys Lys Glu Ser Lys Lys Lys Thr Leu Lys Asn Asp Cys Glu
500 505 510

Glu Asn Gly Leu Ala Lys Glu Ser Glu Asp Asp Leu Asn Lys Glu Ser
515 520 525

Glu Glu Glu Val Gly Pro Thr Lys Glu Ser Glu Glu Asp Asp Ser Glu
530 535 540

Lys Glu Ser Asp Glu Asp Cys Ser Glu Lys Gln Ser Glu Asp Gly Ser
545 550 555 560

Glu Arg Glu Phe Glu Glu Asn Gly Leu Glu Lys Asp Leu Asp Glu Glu
565 570 575

Gly Ser Glu Lys Glu Leu His Glu Asn Val Leu Asp Lys Glu Leu Glu
580 585 590

Glu Asn Asp Ser Glu Asn Ser Glu Phe Glu Asp Asp Gly Ser Glu Lys
595 600 605

Val Leu Asp Glu Glu Gly Ser Glu Arg Glu Phe Asp Glu Asp Ser Asp
610 615 620

Glu Lys Glu Glu Glu Glu Asp Thr Tyr Glu Lys Val Phe Asp Asp Glu
625 630 635 640

Ser Asp Glu Lys Glu Asp Glu Glu Tyr Ala Asp Glu Lys Gly Leu Glu
645 650 655

Ala Ala Asp Lys Lys Ala Glu Glu Gly Asp Ala Asp Glu Lys Leu Phe
660 665 670

Glu Glu Ser Asp Asp Lys Glu Asp Glu Asp Ala Asp Gly Lys Glu Val
675 680 685

Glu Asp Ala Asp Glu Lys Leu Phe Glu Asp Asp Asp Ser Asn Glu Lys
690 695 700

Leu Phe Asp Glu Glu Glu Asp Ser Ser Glu Lys Leu Phe Asp Asp Ser
705 710 715 720

Asp Glu Arg Gly Thr Leu Gly Gly Phe Gly Ser Val Glu Glu Gly Pro
725 730 735

Leu Ser Thr Gly Ser Ser Phe Ile Leu Ser Ser Asp Asp Asp Asp Asp
740 745 750

Asp Ile



<210> SEQ ID NO 3
<211> LENGTH: 19
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 3

Lys Met Asn Ala Gln Glu Thr Ala Thr Gly Met Ala Phe Glu Glu Pro
1 5 10 15

Ile Asp Glu
<210> SEQ ID NO 4
<211> LENGTH: 656
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 4

Met Ala Ser Thr Asp Tyr Ser Thr Tyr Ser Gln Ala Ala Ala Gln Gln
1 5 10 15

Gly Tyr Ser Ala Tyr Thr Ala Gln Pro Thr Gln Gly Tyr Ala Gln Thr
20 25 30

Thr Gln Ala Tyr Gly Gln Gln Ser Tyr Gly Thr Tyr Gly Gln Pro Thr
35 40 45

Asp Val Ser Tyr Thr Gln Ala Gln Thr Thr Ala Thr Tyr Gly Gln Thr
50 55 60

Ala Tyr Ala Thr Ser Tyr Gly Gln Pro Pro Thr Gly Tyr Thr Thr Pro
65 70 75 80

Thr Ala Pro Gln Ala Tyr Ser Gln Pro Val Gln Gly Tyr Gly Thr Gly
85 90 95

Ala Tyr Asp Thr Thr Thr Ala Thr Val Thr Thr Thr Gln Ala Ser Tyr
100 105 110

Ala Ala Gln Ser Ala Tyr Gly Thr Gln Pro Ala Tyr Pro Ala Tyr Gly
115 120 125

Gln Gln Pro Ala Ala Thr Ala Pro Thr Arg Pro Gln Asp Gly Asn Lys
130 135 140

Pro Thr Glu Thr Ser Gln Pro Gln Ser Ser Thr Gly Gly Tyr Asn Gln
145 150 155 160

Pro Ser Leu Gly Tyr Gly Gln Ser Asn Tyr Ser Tyr Pro Gln Val Pro
165 170 175

Gly Ser Tyr Pro Met Gln Pro Val Thr Ala Pro Pro Ser Tyr Pro Pro
180 185 190

Thr Ser Tyr Ser Ser Thr Gln Pro Thr Ser Tyr Asp Gln Ser Ser Tyr
195 200 205

Ser Gln Gln Asn Thr Tyr Gly Gln Pro Ser Ser Tyr Gly Gln Gln Ser
210 215 220

Ser Tyr Gly Gln Gln Ser Ser Tyr Gly Gln Gln Pro Pro Thr Ser Tyr
225 230 235 240

Pro Pro Gln Thr Gly Ser Tyr Ser Gln Ala Pro Ser Gln Tyr Ser Gln
245 250 255

Gln Ser Ser Ser Tyr Gly Gln Gln Ser Ser Phe Arg Gln Asp His Pro
260 265 270

Ser Ser Met Gly Val Tyr Gly Gln Glu Ser Gly Gly Phe Ser Gly Pro
275 280 285

Gly Glu Asn Arg Ser Met Ser Gly Pro Asp Asn Arg Gly Arg Gly Arg
290 295 300

Gly Gly Phe Asp Arg Gly Gly Met Ser Arg Gly Gly Arg Gly Gly Gly
305 310 315 320

Arg Gly Gly Met Gly Ser Ala Gly Glu Arg Gly Gly Phe Asn Lys Pro
325 330 335

Gly Gly Pro Met Asp Glu Gly Pro Asp Leu Asp Leu Gly Pro Pro Val
340 345 350

Asp Pro Asp Glu Asp Ser Asp Asn Ser Ala Ile Tyr Val Gln Gly Leu
355 360 365

Asn Asp Ser Val Thr Leu Asp Asp Leu Ala Asp Phe Phe Lys Gln Cys
370 375 380

Gly Val Val Lys Met Asn Lys Arg Thr Gly Gln Pro Met Ile His Ile
385 390 395 400

Tyr Leu Asp Lys Glu Thr Gly Lys Pro Lys Gly Asp Ala Thr Val Ser
405 410 415

Tyr Glu Asp Pro Pro Thr Ala Lys Ala Ala Val Glu Trp Phe Asp Gly
420 425 430

Lys Asp Phe Gln Gly Ser Lys Leu Lys Val Ser Leu Ala Arg Lys Lys
435 440 445

Pro Pro Met Asn Ser Met Arg Gly Gly Leu Pro Pro Arg Glu Gly Arg
450 455 460

Gly Met Pro Pro Pro Leu Arg Gly Gly Pro Gly Gly Pro Gly Gly Pro
465 470 475 480

Gly Gly Pro Met Gly Arg Met Gly Gly Arg Gly Gly Asp Arg Gly Gly
485 490 495

Phe Pro Pro Arg Gly Pro Arg Gly Ser Arg Gly Asn Pro Ser Gly Gly
500 505 510

Gly Asn Val Gln His Arg Ala Gly Asp Trp Gln Cys Pro Asn Pro Gly
515 520 525

Cys Gly Asn Gln Asn Phe Ala Trp Arg Thr Glu Cys Asn Gln Cys Lys
530 535 540

Ala Pro Lys Pro Glu Gly Phe Leu Pro Pro Pro Phe Pro Pro Pro Gly
545 550 555 560

Gly Asp Arg Gly Arg Gly Gly Pro Gly Gly Met Arg Gly Gly Arg Gly
565 570 575

Gly Leu Met Asp Arg Gly Gly Pro Gly Gly Met Phe Arg Gly Gly Arg
580 585 590

Gly Gly Asp Arg Gly Gly Phe Arg Gly Gly Arg Gly Met Asp Arg Gly
595 600 605

Gly Phe Gly Gly Gly Arg Arg Gly Gly Pro Gly Gly Pro Pro Gly Pro
610 615 620

Leu Met Glu Gln Met Gly Gly Arg Arg Gly Gly Arg Gly Gly Pro Gly
625 630 635 640

Lys Met Asp Lys Gly Glu His Arg Gln Glu Arg Arg Asp Arg Pro Tyr
645 650 655



<210> SEQ ID NO 5
<211> LENGTH: 2672
<212> TYPE: DNA
<213> ORGANISM: Homo sapiens
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: 58..2319

<400> SEQUENCE: 5

agcgtcattt cggcctctta gttcttctga accctgctcc tgagctaggt aggaaac atg 60
Met
1

agc ggc acc aac ttg gat ggg aac gat gag ttt gat gag cag ttg cga 108
Ser Gly Thr Asn Leu Asp Gly Asn Asp Glu Phe Asp Glu Gln Leu Arg
5 10 15

atg caa gaa ttg tac gga gac ggc aag gat ggt gac acc cag acc gat 156
Met Gln Glu Leu Tyr Gly Asp Gly Lys Asp Gly Asp Thr Gln Thr Asp
20 25 30

gcc ggc gga gaa ccc gat tct ctc ggg cag cag ccg acg gac act ccc 204
Ala Gly Gly Glu Pro Asp Ser Leu Gly Gln Gln Pro Thr Asp Thr Pro
35 40 45

tac gag tgg gac ctg gac aaa aag gct tgg ttc ccc aag att act gaa 252
Tyr Glu Trp Asp Leu Asp Lys Lys Ala Trp Phe Pro Lys Ile Thr Glu
50 55 60 65

gat ttc att gct aca tat cag gcc aat tat ggc ttc tct aac gat ggc 300
Asp Phe Ile Ala Thr Tyr Gln Ala Asn Tyr Gly Phe Ser Asn Asp Gly
70 75 80

gca tct agt tct acc gca aat gtt gaa gat gtc cat gct agg act gca 348
Ala Ser Ser Ser Thr Ala Asn Val Glu Asp Val His Ala Arg Thr Ala
85 90 95

gag gaa cct cca caa gaa aaa gcc ccg gaa ccc act gat gcc aga aag 396
Glu Glu Pro Pro Gln Glu Lys Ala Pro Glu Pro Thr Asp Ala Arg Lys
100 105 110

aag gga gaa aaa aga aag gct gag tca gga tgg ttt cat gtt gaa gaa 444
Lys Gly Glu Lys Arg Lys Ala Glu Ser Gly Trp Phe His Val Glu Glu
115 120 125

gac aga aat aca aat gta tac gtg tct ggt ttg cct cca gat att aca 492
Asp Arg Asn Thr Asn Val Tyr Val Ser Gly Leu Pro Pro Asp Ile Thr
130 135 140 145

gtg gat gaa ttt ata caa ctt atg tcc aag ttt ggc att att atg aga 540
Val Asp Glu Phe Ile Gln Leu Met Ser Lys Phe Gly Ile Ile Met Arg
150 155 160

gat cct cag aca gaa gaa ttt aag gtc aaa ctt tac aaa gat aat caa 588
Asp Pro Gln Thr Glu Glu Phe Lys Val Lys Leu Tyr Lys Asp Asn Gln
165 170 175

gga aat ctt aaa gga gac ggt ctt tgc tgt tat ttg aaa aga gaa tct 636
Gly Asn Leu Lys Gly Asp Gly Leu Cys Cys Tyr Leu Lys Arg Glu Ser
180 185 190

gtg gaa ctt gca tta aaa ctt ttg gat gaa gat gaa att aga ggc tac 684
Val Glu Leu Ala Leu Lys Leu Leu Asp Glu Asp Glu Ile Arg Gly Tyr
195 200 205

aaa tta cat gtt gag gtg gca aag ttt caa ctg aag gga gaa tat gat 732
Lys Leu His Val Glu Val Ala Lys Phe Gln Leu Lys Gly Glu Tyr Asp
210 215 220 225

gcc tca aag aag aag aag aag tgc aaa gac tat aag aag aag ctg tct 780
Ala Ser Lys Lys Lys Lys Lys Cys Lys Asp Tyr Lys Lys Lys Leu Ser
230 235 240

atg caa caa aag cag ttg gat tgg aga cct gag agg cga gcc gga cca 828
Met Gln Gln Lys Gln Leu Asp Trp Arg Pro Glu Arg Arg Ala Gly Pro
245 250 255

tcc cgg atg cgc cat gag cga gtt gtc atc atc aag aat atg ttt cat 876
Ser Arg Met Arg His Glu Arg Val Val Ile Ile Lys Asn Met Phe His
260 265 270

cct atg gat ttt gag gat gat ccg ttg gtg ctg aat gag atc aga gaa 924
Pro Met Asp Phe Glu Asp Asp Pro Leu Val Leu Asn Glu Ile Arg Glu
275 280 285

gac ctt cga gta gag tgt tcg aag ttt gga caa att agg aaa ctc ctt 972
Asp Leu Arg Val Glu Cys Ser Lys Phe Gly Gln Ile Arg Lys Leu Leu
290 295 300 305

ctc ttt gat agg cac cca gat ggt gtg gcc tct gtg tcc ttt cgg gat 1020
Leu Phe Asp Arg His Pro Asp Gly Val Ala Ser Val Ser Phe Arg Asp
310 315 320

cca gag gaa gct gat tat tgt att cag act ctc gat gga aga tgg ttt 1068
Pro Glu Glu Ala Asp Tyr Cys Ile Gln Thr Leu Asp Gly Arg Trp Phe
325 330 335

ggt ggc cgt caa atc act gcc cag gca tgg gat ggg act aca gat tat 1116
Gly Gly Arg Gln Ile Thr Ala Gln Ala Trp Asp Gly Thr Thr Asp Tyr
340 345 350

cag gtg gag gaa acc tca aga gaa agg gag gaa agg ctg aga gga tgg 1164
Gln Val Glu Glu Thr Ser Arg Glu Arg Glu Glu Arg Leu Arg Gly Trp
355 360 365

gag gct ttc ctc aat gct cct gag gcc aac aga ggc ctt agc gtt cag 1212
Glu Ala Phe Leu Asn Ala Pro Glu Ala Asn Arg Gly Leu Ser Val Gln
370 375 380 385

att ctg tct ctg ctt cga aag gca ggg cct tct aga gca agg cat ttt 1260
Ile Leu Ser Leu Leu Arg Lys Ala Gly Pro Ser Arg Ala Arg His Phe
390 395 400

tca gag cac ccc agc aca tct aaa atg aat gct caa gaa act gca act 1308
Ser Glu His Pro Ser Thr Ser Lys Met Asn Ala Gln Glu Thr Ala Thr
405 410 415

gga atg gca ttt gaa gaa cct ata gat gag aag aag ttt gaa aag aca 1356
Gly Met Ala Phe Glu Glu Pro Ile Asp Glu Lys Lys Phe Glu Lys Thr
420 425 430

gaa gat ggg gga gaa ttt gaa gaa ggt gct tct gaa aac aat gct aag 1404
Glu Asp Gly Gly Glu Phe Glu Glu Gly Ala Ser Glu Asn Asn Ala Lys
435 440 445

gaa agt agc ccc gaa aaa gag gct gaa gaa ggc tgc cct gaa aaa gaa 1452
Glu Ser Ser Pro Glu Lys Glu Ala Glu Glu Gly Cys Pro Glu Lys Glu
450 455 460 465

tct gaa gag ggc tgc ccc aaa aga ggg ttt gaa ggc agc tgc tcc caa 1500
Ser Glu Glu Gly Cys Pro Lys Arg Gly Phe Glu Gly Ser Cys Ser Gln
470 475 480

aaa gag tct gaa gaa ggc aat ccc gta aga gga tct gaa gag gat agt 1548
Lys Glu Ser Glu Glu Gly Asn Pro Val Arg Gly Ser Glu Glu Asp Ser
485 490 495

cct aaa aaa gag tct aaa aag aag aca ctc aaa aat gat tgt gaa gag 1596
Pro Lys Lys Glu Ser Lys Lys Lys Thr Leu Lys Asn Asp Cys Glu Glu
500 505 510

aat ggc ctt gca aag gaa tct gaa gat gac ctc aac aag gag tct gaa 1644
Asn Gly Leu Ala Lys Glu Ser Glu Asp Asp Leu Asn Lys Glu Ser Glu
515 520 525

gag gag gtt ggc ccc aca aaa gag tcc gaa gaa gat gac tca gag aaa 1692
Glu Glu Val Gly Pro Thr Lys Glu Ser Glu Glu Asp Asp Ser Glu Lys
530 535 540 545

gag tct gat gaa gac tgc tct gaa aaa cag tct gaa gat ggc tcc gaa 1740
Glu Ser Asp Glu Asp Cys Ser Glu Lys Gln Ser Glu Asp Gly Ser Glu
550 555 560

aga gaa ttt gaa gaa aat ggt ctc gag aaa gat ttg gac gag gaa ggt 1788
Arg Glu Phe Glu Glu Asn Gly Leu Glu Lys Asp Leu Asp Glu Glu Gly
565 570 575

tct gaa aag gag ctt cat gaa aat gtt ctt gac aaa gag tta gaa gaa 1836
Ser Glu Lys Glu Leu His Glu Asn Val Leu Asp Lys Glu Leu Glu Glu
580 585 590

aat gac tct gaa aac tcc gaa ttt gaa gat gac ggc tct gaa aaa gtg 1884
Asn Asp Ser Glu Asn Ser Glu Phe Glu Asp Asp Gly Ser Glu Lys Val
595 600 605

tta gat gag gaa ggc tct gag aga gag ttt gac gaa gat tca gat gaa 1932
Leu Asp Glu Glu Gly Ser Glu Arg Glu Phe Asp Glu Asp Ser Asp Glu
610 615 620 625

aag gaa gaa gag gag gat aca tat gaa aaa gta ttt gat gat gag tct 1980
Lys Glu Glu Glu Glu Asp Thr Tyr Glu Lys Val Phe Asp Asp Glu Ser
630 635 640

gat gag aaa gag gat gaa gaa tat gca gat gaa aag ggg ctt gaa gct 2028
Asp Glu Lys Glu Asp Glu Glu Tyr Ala Asp Glu Lys Gly Leu Glu Ala
645 650 655

gct gat aaa aag gcg gaa gaa ggt gat gca gat gaa aag ctg ttt gaa 2076
Ala Asp Lys Lys Ala Glu Glu Gly Asp Ala Asp Glu Lys Leu Phe Glu
660 665 670

gag tca gat gac aag gaa gat gaa gat gca gat gga aag gaa gtt gaa 2124
Glu Ser Asp Asp Lys Glu Asp Glu Asp Ala Asp Gly Lys Glu Val Glu
675 680 685

gat gct gac gaa aag ttg ttc gaa gat gat gat tcc aat gag aag ttg 2172
Asp Ala Asp Glu Lys Leu Phe Glu Asp Asp Asp Ser Asn Glu Lys Leu
690 695 700 705

ttt gat gag gag gaa gat tcc agt gag aag ttg ttt gac gat tct gat 2220
Phe Asp Glu Glu Glu Asp Ser Ser Glu Lys Leu Phe Asp Asp Ser Asp
710 715 720

gag agg ggg act ttg ggt ggt ttt ggg agt gtt gaa gaa ggg ccc cta 2268
Glu Arg Gly Thr Leu Gly Gly Phe Gly Ser Val Glu Glu Gly Pro Leu
725 730 735

tcc act ggc agc agc ttt att ctc agt agc gat gat gat gac gat gat 2316
Ser Thr Gly Ser Ser Phe Ile Leu Ser Ser Asp Asp Asp Asp Asp Asp
740 745 750

att taatccctta aacttgcttt ttagggagag tcctccatct acatttgcct gtgctt 2375
Ile

cagggtaatt actagtagtg ttacatgaac atgtgcatag tggtaggatg ccatcagatt 2435

aaagcattga agtgtttcat tgttacctgt acctaatggt tttaaatata tgttaattga 2495

ttgtttagtt aaaatgtcat agttacaatg caagtaaact ggatacttgt tcttttgtca 2555

gatttgttaa atgcatgcag aataatattt ttaagagtat tgattgaagt ttgtgatatt 2615

catcaataaa aatgagttga taatatgcag aaactgaaaa aaaaaaaaaa aaaaaaa 2672
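The feature coordinates in the listing can be checked against one another with simple arithmetic. The following is a minimal sketch, not part of the patent: it assumes the CDS locations and the protein length exactly as given in the <222> and <211> fields above, and the helper names (codon_count, to_one_letter, locate) are hypothetical. It confirms that each CDS span holds 754 whole codons, matching the 754-residue length of SEQ ID NO:2, and it locates the 19-residue peptide of SEQ ID NO:3 within an excerpt of SEQ ID NO:2.

```python
# CDS coordinates copied from the <222> LOCATION fields (1-based, inclusive).
CDS_FEATURES = {
    "SEQ ID NO:1": (110, 2371),
    "SEQ ID NO:5": (58, 2319),
}
PROTEIN_LENGTH = 754  # <211> LENGTH of SEQ ID NO:2


def codon_count(start, end):
    """Return the number of whole codons in a 1-based inclusive span."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a multiple of 3")
    return span // 3


# Three-letter to one-letter codes, limited to residues used in the excerpt.
THREE_TO_ONE = {
    "Ala": "A", "Asn": "N", "Asp": "D", "Gln": "Q", "Glu": "E",
    "Gly": "G", "His": "H", "Ile": "I", "Lys": "K", "Met": "M",
    "Phe": "F", "Pro": "P", "Ser": "S", "Thr": "T",
}


def to_one_letter(residues):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[r] for r in residues.split())


def locate(peptide, excerpt, excerpt_start):
    """Return (first, last) residue numbers of peptide in excerpt, or None."""
    offset = excerpt.find(peptide)
    if offset < 0:
        return None
    first = excerpt_start + offset
    return first, first + len(peptide) - 1


# SEQ ID NO:3 as given in the listing.
SEQ3 = to_one_letter(
    "Lys Met Asn Ala Gln Glu Thr Ala Thr Gly Met Ala Phe Glu Glu Pro "
    "Ile Asp Glu"
)
# Residues 401 to 432 of SEQ ID NO:2 as given in the listing.
SEQ2_401_432 = to_one_letter(
    "Phe Ser Glu His Pro Ser Thr Ser Lys Met Asn Ala Gln Glu Thr Ala "
    "Thr Gly Met Ala Phe Glu Glu Pro Ile Asp Glu Lys Lys Phe Glu Lys"
)

for name, (start, end) in CDS_FEATURES.items():
    print(name, codon_count(start, end) == PROTEIN_LENGTH)
print("SEQ ID NO:3 within SEQ ID NO:2:", locate(SEQ3, SEQ2_401_432, 401))
```

Under these assumptions, the peptide of SEQ ID NO:3 appears to correspond to residues 409 to 427 of SEQ ID NO:2. The check is only a consistency test on the printed coordinates and says nothing about how the peptide was obtained.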



What is claimed is: 

1. An isolated nucleic acid molecule encoding a Tat- 
Stimulatory Factor protein selected from the group consist- 
ing of 

(a) nucleic acid molecules which hybridize under stringent
conditions to a molecule consisting of the coding
region of the nucleic acid sequence of SEQ ID NO:1
and which codes for the Tat-Stimulatory Factor protein,

(b) nucleic acid molecules that differ from the nucleic acid
molecules of (a) in codon sequence due to the degeneracy
of the genetic code, and

(c) complements of (a) and (b). 

2. The isolated nucleic acid molecule of claim 1, wherein
the isolated nucleic acid molecule comprises the coding
region of SEQ ID NO:1.

3. The isolated nucleic acid molecule of claim 1, wherein
the isolated nucleic acid molecule consists of the coding
region of SEQ ID NO:1.

4. An isolated nucleic acid molecule selected from the
group consisting of (a) a unique fragment of SEQ ID NO:1
between 12 and 2650 nucleotides in length, and (b)
complements of (a).

5. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is a unique fragment of
the coding region of SEQ ID NO:1 and wherein the isolated
nucleic acid molecule is selected from the group consisting
of (a) at least 14 contiguous nucleotides of SEQ ID NO:1, and
(b) complements of (a).

6. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is selected from the group
consisting of (a) at least 15 contiguous nucleotides of SEQ
ID NO:1, and (b) complements of (a).

7. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is selected from the group
consisting of (a) at least 16 contiguous nucleotides of SEQ
ID NO:1, and (b) complements of (a).

8. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is selected from the group
consisting of (a) at least 17 contiguous nucleotides of SEQ
ID NO:1, and (b) complements of (a).

9. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is selected from the group
consisting of (a) at least 18 contiguous nucleotides of SEQ
ID NO:1, and (b) complements of (a).

10. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is selected from the group
consisting of (a) at least 20 contiguous nucleotides of SEQ
ID NO:1, and (b) complements of (a).

11. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is selected from the group
consisting of (a) at least 22 contiguous nucleotides of SEQ
ID NO:1, and (b) complements of (a).

12. The isolated nucleic acid molecule of claim 4, wherein
the isolated nucleic acid molecule is selected from the group
consisting of (a) between 12 and 32 contiguous nucleotides
of SEQ ID NO:1, and (b) complements of (a).

13. A host cell transformed or transfected with an expres- 
sion vector comprising the isolated nucleic acid molecule of 
any of claims 1, 2 or 3, operably linked to a promoter. 
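Three sequence relationships recited in the claims (codon degeneracy, complements, and contiguous fragments of a given length) can be illustrated on short excerpts of SEQ ID NO:1. The following is a minimal sketch under stated assumptions, not a statement of claim scope: the codon table is deliberately partial, the synonymous variant is a hypothetical example, and all function names are illustrative. It does not model stringent hybridization or whether a fragment is "unique".

```python
# Partial codon table: only the codons used in this illustration.
CODON_TABLE = {
    "atg": "Met",
    "agc": "Ser", "tcc": "Ser",
    "ggc": "Gly", "ggt": "Gly",
    "acc": "Thr", "aca": "Thr",
}


def translate(dna):
    """Translate a coding string codon by codon using the partial table."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)]


def reverse_complement(dna):
    """Return the reverse complement of a lowercase DNA string."""
    pairs = {"a": "t", "t": "a", "c": "g", "g": "c", "n": "n"}
    return "".join(pairs[base] for base in reversed(dna))


def contiguous_fragments(seq, k):
    """Return every window of k contiguous nucleotides in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# First four codons of the CDS in SEQ ID NO:1 (Met Ser Gly Thr).
cds_start = "atgagcggcacc"
# Hypothetical variant that differs only by synonymous codons.
synonymous = "atgtccggtaca"

# First 30 nucleotides of SEQ ID NO:1, used to enumerate 12-mers.
five_prime = "gggaaagctggtacgcctgcaggtaccggt"
fragments_12 = contiguous_fragments(five_prime, 12)

print(translate(cds_start) == translate(synonymous))
print(reverse_complement(cds_start))
print(len(fragments_12), fragments_12[0])
```

The two coding strings differ at the nucleotide level yet translate to the same residues, which is the sense of "degeneracy of the genetic code" used in claim 1(b). The fragment enumeration only shows what "contiguous nucleotides" of a given length means; deciding whether a fragment is unique would require comparison against other sequences, which this sketch does not attempt.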

***** 